Coding device, decoding device, coding method, and decoding method

ABSTRACT

A down-sampler 101 down-samples the sampling rate of an input signal from sampling rate FH to sampling rate FL. A base layer coder 102 encodes the sampling rate FL acoustic signal. A local decoder 103 decodes coding information output from base layer coder 102. An up-sampler 104 raises the sampling rate of the decoded signal to FH. A subtracter 106 subtracts the decoded signal from the sampling rate FH acoustic signal. An enhancement layer coder 107 encodes the signal output from subtracter 106 using a decoding result parameter output from local decoder 103.

TECHNICAL FIELD

The present invention relates to a coding apparatus, decoding apparatus, coding method, and decoding method that perform highly efficient compression coding of an acoustic signal such as an audio signal or speech signal, and more particularly to a coding apparatus, decoding apparatus, coding method, and decoding method that are suitable for scalable coding and decoding that enable decoding of audio or speech even from a part of coding information.

BACKGROUND ART

A sound coding technology that compresses an audio signal or speech signal at a low bit rate is important for efficient utilization of radio in mobile communications and recording media. Methods for speech coding, in which a speech signal is coded, include G726 and G729 standardized by the ITU (International Telecommunication Union). These methods encode narrowband signals (300 Hz to 3.4 kHz), and enable high-quality coding at bit rates of 8 kbits/s to 32 kbits/s.

Standard methods for wideband signals (50 Hz to 7 kHz) include the ITU's G722 and G722.1, and AMR-WB of 3GPP (The 3rd Generation Partnership Project). These methods enable high-quality coding of wideband speech signals at bit rates of 6.6 kbits/s to 64 kbits/s.

An effective method of performing highly efficient coding of speech signals at a low bit rate is CELP (Code Excited Linear Prediction). CELP is a method whereby coding is performed based on an engineering model that simulates human voice production. To be specific, in CELP, an excitation signal which consists of random values is passed to a pitch filter corresponding to the strength of periodicity and a synthesis filter corresponding to vocal tract characteristics, and coding parameters are determined so that the square error between the output signal and input signal is minimized under auditory characteristic weighting.

In many of the latest standard speech coding methods, coding is performed based on CELP. For example, G729 enables narrowband signal coding at 8 kbits/s, and AMR-WB enables wideband signal coding at 6.6 kbits/s to 23.85 kbits/s.

Meanwhile, in the case of audio coding that encodes audio signals, methods that convert an audio signal to the frequency domain and perform coding using an auditory psychoacoustic model are commonly used, such as the Layer III method and AAC method standardized by MPEG (Moving Picture Experts Group). It is known that with these methods, almost no degradation occurs at 64 kbits/s to 96 kbits/s per channel for a signal with a 44.1 kHz sampling rate.

This audio coding is a method whereby high-quality coding is performed on music. Audio coding can also perform high-quality coding for a speech signal with music or environmental sound in the background as described above, and can handle a signal band of approximately 22 kHz, which is CD quality.

However, when coding is performed using a speech coding method on a signal in which a speech signal is predominant and music or environmental sound is superimposed in the background, there is a problem in that, due to the background music or environmental sound, not only the background signal but also the speech signal degrades, and overall quality deteriorates.

This problem occurs because speech coding methods are based on a method specialized toward a CELP speech model. There is a problem in that speech coding methods can only handle signal bands up to 7 kHz, and a signal that has components in higher bands cannot be handled adequately in terms of composition.

Moreover, with an audio coding method, a high bit rate must be used in order to achieve high-quality coding. With an audio coding method, if coding should be performed with the bit rate held down to 32 kbits/s, there is a problem of a major deterioration of decoded signal quality. There is thus a problem in that use is not possible on a communication network with a low transmission rate.

DISCLOSURE OF INVENTION

It is an object of the present invention to provide a coding apparatus, decoding apparatus, coding method, and decoding method that enable high-quality coding and decoding at a low bit rate even of a signal in which a speech signal is predominant and music or environmental sound is superimposed in the background.

This object is achieved by having two layers, a base layer and an enhancement layer, performing high-quality coding at a low bit rate of an input signal narrowband or wideband frequency region based on CELP in the base layer, and performing coding in the enhancement layer of background music or environmental sound that cannot be represented in the base layer, and also signals with higher frequency components than the frequency region covered by the base layer.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a block diagram showing the configuration of a signal processing apparatus according to Embodiment 1 of the present invention;

FIG. 2 is a drawing showing an example of input signal components;

FIG. 3 is a drawing showing an example of a signal processing method of a signal processing apparatus according to the above embodiment;

FIG. 4 is a drawing showing an example of the configuration of a base layer coder;

FIG. 5 is a drawing showing an example of the configuration of an enhancement layer coder;

FIG. 6 is a drawing showing an example of the configuration of an enhancement layer coder;

FIG. 7 is a drawing showing an example of LPC coefficient calculation in the enhancement layer;

FIG. 8 is a block diagram showing the configuration of the enhancement layer coder of a signal processing apparatus according to Embodiment 3 of the present invention;

FIG. 9 is a block diagram showing the configuration of the enhancement layer coder of a signal processing apparatus according to Embodiment 4 of the present invention;

FIG. 10 is a block diagram showing the configuration of a signal processing apparatus according to Embodiment 5 of the present invention;

FIG. 11 is a block diagram showing an example of a base layer decoder;

FIG. 12 is a block diagram showing an example of an enhancement layer decoder;

FIG. 13 is a drawing showing an example of the configuration of an enhancement layer decoder;

FIG. 14 is a block diagram showing the configuration of the enhancement layer decoder of a signal processing apparatus according to Embodiment 7 of the present invention;

FIG. 15 is a block diagram showing the configuration of the enhancement layer decoder of a signal processing apparatus according to Embodiment 8 of the present invention;

FIG. 16 is a block diagram showing the configuration of a sound coding apparatus according to Embodiment 9 of the present invention;

FIG. 17 is a drawing showing an example of acoustic signal information distribution;

FIG. 18 is a drawing showing an example of regions subject to coding in the base layer and enhancement layer;

FIG. 19 is a drawing showing an example of an acoustic (music) signal spectrum;

FIG. 20 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus of the above embodiment;

FIG. 21 is a drawing showing an example of the internal configuration of the auditory masking calculator of a sound coding apparatus of the above embodiment;

FIG. 22 is a block diagram showing an example of the internal configuration of an enhancement layer coder of the above embodiment;

FIG. 23 is a block diagram showing an example of the internal configuration of an auditory masking calculator of the above embodiment;

FIG. 24 is a block diagram showing the configuration of a sound decoding apparatus according to Embodiment 9 of the present invention;

FIG. 25 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus of the above embodiment;

FIG. 26 is a block diagram showing an example of the internal configuration of a base layer coder of Embodiment 10 of the present invention;

FIG. 27 is a block diagram showing an example of the internal configuration of a base layer decoder of the above embodiment;

FIG. 28 is a block diagram showing an example of the internal configuration of a base layer decoder of the above embodiment;

FIG. 29 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus according to Embodiment 11 of the present invention;

FIG. 30 is a drawing showing an example of a residual error spectrum calculated by an estimated error spectrum calculator of the above embodiment;

FIG. 31 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus according to Embodiment 12 of the present invention;

FIG. 32 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus of the above embodiment;

FIG. 33 is a block diagram showing an example of the internal configuration of the enhancement layer coder of a sound coding apparatus according to Embodiment 13 of the present invention;

FIG. 34 is a drawing showing an example of ranking of estimated distortion values by an ordering section of the above embodiment;

FIG. 35 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 13 of the present invention;

FIG. 36 is a block diagram showing an example of the internal configuration of the enhancement layer coder of a sound coding apparatus according to Embodiment 14 of the present invention;

FIG. 37 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 14 of the present invention;

FIG. 38 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus of the above embodiment;

FIG. 39 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 14 of the present invention;

FIG. 40 is a block diagram showing the configuration of a communication apparatus according to Embodiment 15 of the present invention;

FIG. 41 is a block diagram showing the configuration of a communication apparatus according to Embodiment 16 of the present invention;

FIG. 42 is a block diagram showing the configuration of a communication apparatus according to Embodiment 17 of the present invention; and

FIG. 43 is a block diagram showing the configuration of a communication apparatus according to Embodiment 18 of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

Essentially, the present invention has two layers, a base layer and an enhancement layer, performs high-quality coding at a low bit rate of an input signal narrowband or wideband frequency region based on CELP in the base layer, and then performs coding in the enhancement layer of background music or environmental sound that cannot be represented in the base layer, and also signals with higher frequency components than the frequency region covered by the base layer, with the enhancement layer having a configuration that enables handling of all signals as with an audio coding method.

By this means, it is possible to perform efficient coding of background music or environmental sound that cannot be represented in the base layer, and also signals with higher frequency components than the frequency region covered by the base layer. A feature of the present invention is that, at this time, enhancement layer coding is performed using information obtained from the base layer coding information. By this means, the number of enhancement layer coded bits can be kept down.

With reference now to the accompanying drawings, embodiments of the present invention will be explained in detail below.

Embodiment 1

FIG. 1 is a block diagram showing the configuration of a signal processing apparatus according to Embodiment 1 of the present invention. Signal processing apparatus 100 in FIG. 1 mainly comprises a down-sampler 101, base layer coder 102, local decoder 103, up-sampler 104, delayer 105, subtracter 106, enhancement layer coder 107, and multiplexer 108.

Down-sampler 101 down-samples the input signal sampling rate from sampling rate FH to sampling rate FL, and outputs the sampling rate FL acoustic signal to base layer coder 102. Here, sampling rate FL is a lower frequency than sampling rate FH.

Base layer coder 102 encodes the sampling rate FL acoustic signal and outputs the coding information to local decoder 103 and multiplexer 108.

Local decoder 103 decodes the coding information output from base layer coder 102, outputs the decoded signal to up-sampler 104, and outputs parameters obtained from the decoded result to enhancement layer coder 107.

Up-sampler 104 raises the decoded signal sampling rate to FH, and outputs the result to subtracter 106.

Delayer 105 delays the input sampling rate FH acoustic signal by a predetermined time, then outputs the signal to subtracter 106. By making this delay time equal to the time delay arising in down-sampler 101, base layer coder 102, local decoder 103, and up-sampler 104, phase shift is prevented in the following subtraction processing.

Subtracter 106 subtracts the decoded signal from the sampling rate FH acoustic signal, and outputs the result of the subtraction to enhancement layer coder 107.

Enhancement layer coder 107 encodes the signal output from subtracter 106 using the decoding result parameters output from local decoder 103, and outputs the resulting signal to multiplexer 108. Multiplexer 108 multiplexes and outputs the signals coded by base layer coder 102 and enhancement layer coder 107.
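
The signal flow through blocks 101 to 106 can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the base layer coder is replaced by a plain uniform quantizer, the rate change is an integer factor without any filtering, and the delay is given in samples. It is not the CELP coder or the resampling method of the embodiment.

```python
def down_sample(x, factor):
    """Stand-in for down-sampler 101: keep every `factor`-th sample (no filtering)."""
    return x[::factor]


def up_sample(x, factor):
    """Stand-in for up-sampler 104: sample-and-hold back to the higher rate."""
    return [v for v in x for _ in range(factor)]


def toy_base_encode(x, step=0.25):
    """Stand-in for base layer coder 102: uniform scalar quantization (assumption)."""
    return [round(v / step) for v in x]


def toy_local_decode(codes, step=0.25):
    """Stand-in for local decoder 103: inverse of toy_base_encode."""
    return [c * step for c in codes]


def encode_two_layers(x_fh, factor=2, delay=0):
    """Return the base layer codes and the residual passed to the enhancement layer."""
    low_rate = down_sample(x_fh, factor)
    base_codes = toy_base_encode(low_rate)
    decoded_fh = up_sample(toy_local_decode(base_codes), factor)
    # Delayer 105: align the input with the decoded signal before subtracting.
    delayed = [0.0] * delay + x_fh[:len(x_fh) - delay]
    # Subtracter 106: input signal minus decoded signal.
    residual = [d - y for d, y in zip(delayed, decoded_fh)]
    return base_codes, residual


signal = [0.0, 0.1, 0.5, 0.6, 1.0, 0.9, 0.3, 0.2]
base_codes, residual = encode_two_layers(signal)
```

The residual carries whatever the toy base layer failed to represent, which is the signal that enhancement layer coder 107 would go on to encode.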

Base layer coding and enhancement layer coding will now be explained. FIG. 2 is a drawing showing an example of input signal components. In FIG. 2, the vertical axis indicates the signal component information amount, and the horizontal axis indicates frequency. FIG. 2 shows the frequency bands in which speech information and background music/background noise information contained in the input signal are present.

In the case of speech information, there is a large amount of information in the low frequency region, and the amount of information decreases the higher the frequency region. Conversely, in the case of background music and background noise information, there is comparatively little information in the lower region compared with speech information, and a large amount of information in the higher region.

Thus, a signal processing apparatus of the present invention uses a plurality of coding methods, and performs different coding for each region for which the respective coding methods are appropriate.

FIG. 3 is a drawing showing an example of a signal processing method of a signal processing apparatus according to this embodiment. In FIG. 3, the vertical axis indicates the signal component information amount, and the horizontal axis indicates frequency.

Base layer coder 102 is designed to represent efficiently speech information in the frequency band from 0 to FL, and can perform good-quality coding of speech information in this region. However, the coding quality of background music and background noise information in the frequency band from 0 to FL is not high. Enhancement layer coder 107 encodes portions that cannot be coded by base layer coder 102, and signals in the frequency band from FL to FH.

Thus, by combining base layer coder 102 and enhancement layer coder 107, it is possible to achieve high-quality coding in a wide band. Moreover, a scalable function can be implemented whereby speech information can be decoded even when only the coding information of the base layer coding section is available.

In this configuration, a useful parameter from among those generated by decoding in local decoder 103 is supplied to enhancement layer coder 107, and enhancement layer coder 107 performs coding using this parameter.

As this parameter is generated from coding information, when a signal coded by a signal processing apparatus of this embodiment is decoded, the same parameter can be obtained in the sound decoding process, and it is not necessary to add this parameter for transmission to the decoding side. As a result, the enhancement layer coding section can achieve efficient coding processing without incurring an increase in additional information.

For example, there is a configuration whereby, of the parameters decoded by local decoder 103, a voiced/unvoiced flag, indicating whether an input signal is a signal with marked periodicity such as a vowel or a signal with marked noise characteristics such as a consonant, is used as a parameter employed by enhancement layer coder 107. It is possible to perform adaptation using the voiced/unvoiced flag, such as performing bit allocation stressing the lower region more than the higher region in the enhancement layer in a voiced section, and performing bit allocation stressing the higher region more than the lower region in an unvoiced section.

Thus, according to a signal processing apparatus of this embodiment, by extracting components not exceeding a predetermined frequency from an input signal and performing coding suitable for speech coding, and performing coding suitable for audio coding using the results of decoding the obtained coding information, it is possible to perform high-quality coding at a low bit rate.

For sampling rates FH and FL, it is only necessary for FH to be a higher value than FL, and there are no other restrictions on the values. For example, coding can be performed with sampling rates of FH=24 kHz and FL=16 kHz.
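
As a rough illustration of this adaptation, the sketch below splits an enhancement layer bit budget between a lower and a higher region according to the voiced/unvoiced flag. The function name, the two-region split, and the 70/30 emphasis ratio are all assumptions made for the example; the text does not specify how the stress is quantified.

```python
def allocate_bits(total_bits, voiced, emphasis=0.7):
    """Split a bit budget between the lower and higher regions.

    `emphasis` (hypothetical) is the share given to the favoured region:
    the lower region for voiced sections, the higher region for unvoiced ones.
    """
    if not 0.5 <= emphasis <= 1.0:
        raise ValueError("emphasis must lie in [0.5, 1.0]")
    favoured = int(round(total_bits * emphasis))
    other = total_bits - favoured
    if voiced:
        return {"low": favoured, "high": other}
    return {"low": other, "high": favoured}


voiced_split = allocate_bits(100, voiced=True)
unvoiced_split = allocate_bits(100, voiced=False)
```

Because the flag is recovered from the base layer coding information on both sides, the decoder can repeat the same split without any extra transmitted bits.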

Embodiment 2

In this embodiment an example is described in which, of the parameters decoded by local decoder 103 of Embodiment 1, LPC coefficients indicating the input signal spectrum are used as a parameter utilized by enhancement layer coder 107.

A signal processing apparatus of this embodiment performs coding using CELP in base layer coder 102 in FIG. 1, and performs coding using LPC coefficients indicating the input signal spectrum in enhancement layer coder 107.

A detailed description of the operation of base layer coder 102 will first be given, followed by a description of the basic configuration of enhancement layer coder 107. The “basic configuration” mentioned here is intended to simplify the descriptions of subsequent embodiments, and denotes a configuration that does not use local decoder 103 coding parameters. Thereafter, a description is given of enhancement layer coder 107, which uses the LPC coefficients decoded by local decoder 103, this being a feature of this embodiment.

FIG. 4 is a drawing showing an example of the configuration of base layer coder 102. Base layer coder 102 mainly comprises an LPC analyzer 401, weighting section 402, adaptive code book search unit 403, adaptive gain quantizer 404, target vector generator 405, noise code book search unit 406, noise gain quantizer 407, and multiplexer 408.

LPC analyzer 401 obtains LPC coefficients from the input signal sampled at sampling rate FL by down-sampler 101, and outputs these LPC coefficients to weighting section 402.

Weighting section 402 performs weighting on the input signal based on the LPC coefficients obtained by LPC analyzer 401, and outputs the weighted input signal to adaptive code book search unit 403, adaptive gain quantizer 404, and target vector generator 405.

Adaptive code book search unit 403 carries out an adaptive code book search with the weighted input signal as the target signal, and outputs the retrieved adaptive vector to adaptive gain quantizer 404 and target vector generator 405. Adaptive code book search unit 403 then outputs the code of the adaptive vector determined to have the least quantization distortion to multiplexer 408.

Adaptive gain quantizer 404 quantizes the adaptive gain that is multiplied by the adaptive vector output from adaptive code book search unit 403, and outputs the result to target vector generator 405. This code is then output to multiplexer 408.

Target vector generator 405 performs vector subtraction of the result of multiplying the adaptive vector by the adaptive gain from the weighted input signal output from weighting section 402, and outputs the result of the subtraction to noise code book search unit 406 and noise gain quantizer 407 as the target vector.

Noise code book search unit 406 retrieves from a noise code book the noise vector for which distortion relative to the target vector output from target vector generator 405 is smallest. Noise code book search unit 406 then supplies the retrieved noise vector to noise gain quantizer 407 and also outputs that code to multiplexer 408.

Noise gain quantizer 407 quantizes noise gain that is multiplied by the noise vector retrieved by noise code book search unit 406, and outputs that code to multiplexer 408.

Multiplexer 408 multiplexes the LPC coefficients, adaptive vector, adaptive gain, noise vector, and noise gain coding information, and outputs the resulting signal to local decoder 103 and multiplexer 108.

Next, the operation of base layer coder 102 in FIG. 4 will be described. First, a sampling rate FL signal output from down-sampler 101 is input, and LPC coefficients are obtained by LPC analyzer 401. The LPC coefficients are converted to a parameter suitable for quantization such as LSP coefficients, and quantized. The coding information obtained by this quantization is supplied to multiplexer 408, and the quantized LSP coefficients are calculated from the coding information and converted to LPC coefficients.

By means of this quantization, the quantized LPC coefficients are obtained. Using the quantized LPC coefficients, adaptive code book, adaptive gain, noise code book, and noise gain coding is performed.

Weighting section 402 then performs weighting on the input signal based on the LPC coefficients obtained by LPC analyzer 401. The purpose of this weighting is to perform spectrum shaping so that the quantization distortion spectrum is masked by the spectral envelope of the input signal.

The adaptive code book is then searched by adaptive code book search unit 403 with the weighted input signal as the target signal. A signal in which a past excitation sequence is repeated on a pitch period basis is called an adaptive vector, and an adaptive code book is composed of adaptive vectors generated at pitch periods of a predetermined range.

If a weighted input signal is designated t(n), and a signal in which an impulse response of a weighted synthesis filter comprising the LPC coefficients is convolved with the adaptive vector of pitch period i is designated pi(n), then pitch period i of the adaptive vector for which evaluation function D of Equation (1) below is minimized is sent to multiplexer 408 as a parameter.

$$D = \sum_{n=0}^{N-1} t^{2}(n) - \frac{\left( \sum_{n=0}^{N-1} t(n)\, p_{i}(n) \right)^{2}}{\sum_{n=0}^{N-1} p_{i}^{2}(n)} \qquad (1)$$

Here, N indicates the vector length.
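
The search over pitch periods can be sketched as follows. This is a minimal illustration, assuming the filtered candidates pi(n) have already been computed and are handed in as a dictionary keyed by pitch period; the function names and the dictionary layout are hypothetical.

```python
def evaluation_d(t, p):
    """Evaluation function D of Equation (1) for target t and filtered candidate p."""
    energy_t = sum(v * v for v in t)
    cross = sum(a * b for a, b in zip(t, p))
    energy_p = sum(v * v for v in p)
    if energy_p == 0.0:
        # A silent candidate cannot reduce the error at all.
        return energy_t
    return energy_t - cross * cross / energy_p


def search_pitch(t, candidates):
    """Return the pitch period whose filtered adaptive vector minimizes D."""
    return min(candidates, key=lambda i: evaluation_d(t, candidates[i]))


target = [1.0, 2.0, 3.0, 4.0]
candidates = {
    40: [1.0, 0.0, 0.0, 0.0],
    41: [2.0, 4.0, 6.0, 8.0],  # exactly proportional to the target
    42: [4.0, 3.0, 2.0, 1.0],
}
best_pitch = search_pitch(target, candidates)
```

Since the first term of D does not depend on i, minimizing D is the same as maximizing the normalized squared correlation in the second term, which is how such searches are usually organized in practice.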

Next, quantization of the adaptive gain that is multiplied by the adaptive vector is performed by adaptive gain quantizer 404. Adaptive gain β is expressed by Equation (2). This β value undergoes scalar quantization, and the resulting code is sent to multiplexer 408.

$$\beta = \frac{\sum_{n=0}^{N-1} t(n)\, p_{i}(n)}{\sum_{n=0}^{N-1} p_{i}^{2}(n)} \qquad (2)$$

The effect of the adaptive vector is then subtracted from the input signal by target vector generator 405, and the target vector used by noise code book search unit 406 and noise gain quantizer 407 is generated. If pi(n) here designates a signal in which the synthesis filter is convolved with the adaptive vector when evaluation function D expressed by Equation (1) is minimized, and βq designates the quantization value when adaptive gain β expressed by Equation (2) undergoes scalar quantization, then target vector t2(n) is expressed by Equation (3) below.

$$t_{2}(n) = t(n) - \beta_{q}\, p_{i}(n) \qquad (3)$$
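
The gain calculation, its scalar quantization, and the target vector update can be sketched together. The quantizer table below is a made-up example; the text states only that scalar quantization is used, not which levels.

```python
def optimal_gain(t, p):
    """Adaptive gain beta as the ratio of cross-correlation to candidate energy."""
    energy_p = sum(v * v for v in p)
    if energy_p == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(t, p)) / energy_p


def quantize_scalar(value, levels):
    """Nearest-level scalar quantizer; `levels` is a hypothetical table."""
    index = min(range(len(levels)), key=lambda k: abs(levels[k] - value))
    return index, levels[index]


def target_vector(t, p, beta_q):
    """Subtract the gain-scaled adaptive contribution from the weighted input."""
    return [a - beta_q * b for a, b in zip(t, p)]


t = [1.0, 2.0, 3.0, 4.0]
p = [2.0, 4.0, 6.0, 8.0]
gain_levels = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed quantizer table

beta = optimal_gain(t, p)
gain_code, beta_q = quantize_scalar(beta, gain_levels)
t2 = target_vector(t, p, beta_q)
```

Only `gain_code` would be transmitted; the decoder looks up the same table to recover the quantized gain.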

Aforementioned target vector t2(n) and the LPC coefficients are supplied to noise code book search unit 406, and a noise code book search is carried out.

Here, a typical composition of the noise code book with which noise code book search unit 406 is provided is algebraic. In an algebraic code book, a noise vector is represented by a vector that has only a predetermined, extremely small number of amplitude-1 pulses. Also, with an algebraic code book, the positions that each pulse can occupy are decided beforehand so that they do not overlap between pulses. Thus, a feature of an algebraic code book is that an optimal combination of pulse position and pulse sign (polarity) can be determined by a small amount of computation.

If the target vector is designated t2(n), and a signal in which an impulse response of a weighted synthesis filter is convolved with the noise vector corresponding to code j is designated cj(n), then index j of the noise vector for which evaluation function D of Equation (4) below is minimized is sent to multiplexer 408 as a parameter.

$$D = \sum_{n=0}^{N-1} t_{2}^{2}(n) - \frac{\left( \sum_{n=0}^{N-1} t_{2}(n)\, c_{j}(n) \right)^{2}}{\sum_{n=0}^{N-1} c_{j}^{2}(n)} \qquad (4)$$

Next, quantization of the noise gain that is multiplied by the noise vector is performed by noise gain quantizer 407. Noise gain γ is expressed by Equation (5). This γ value undergoes scalar quantization, and the resulting code is sent to multiplexer 408.

$$\gamma = \frac{\sum_{n=0}^{N-1} t_{2}(n)\, c_{j}(n)}{\sum_{n=0}^{N-1} c_{j}^{2}(n)} \qquad (5)$$
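
A toy version of the noise code book search and gain calculation is sketched below. It makes several simplifying assumptions that are not in the text: the weighted synthesis filter is taken to be the identity (so the pulse vector itself plays the role of cj(n)), there is one pulse per track, the search is exhaustive, and only candidates with positive correlation are considered so that the gain comes out positive. The track layout is hypothetical.

```python
from itertools import product


def pulse_vector(length, pulses):
    """Build a sparse vector from (position, sign) pairs of unit-amplitude pulses."""
    vec = [0.0] * length
    for position, sign in pulses:
        vec[position] += sign
    return vec


def search_algebraic(t2, tracks):
    """Exhaustively pick one signed pulse per track, maximizing cross^2 / energy.

    Maximizing that ratio is equivalent to minimizing D for a fixed target.
    Returns (pulses, gain), or ([], 0.0) when no candidate correlates positively.
    """
    best_score, best_pulses, best_gain = None, [], 0.0
    for positions in product(*tracks):
        for signs in product((1.0, -1.0), repeat=len(tracks)):
            pulses = list(zip(positions, signs))
            c = pulse_vector(len(t2), pulses)
            cross = sum(a * b for a, b in zip(t2, c))
            energy = sum(v * v for v in c)
            if cross <= 0.0 or energy == 0.0:
                continue
            score = cross * cross / energy
            if best_score is None or score > best_score:
                best_score, best_pulses, best_gain = score, pulses, cross / energy
    return best_pulses, best_gain


t2 = [0.0, 3.0, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0]
tracks = [[0, 2, 4, 6], [1, 3, 5, 7]]  # non-overlapping position sets (assumed)
pulses, gamma = search_algebraic(t2, tracks)
```

Real algebraic code book searches avoid this brute-force loop by exploiting the pulse structure, which is the computational advantage the description refers to.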

Multiplexer 408 multiplexes the sent LPC coefficients, adaptive code book, adaptive gain, noise code book, and noise gain coding information, and outputs the resulting signal to local decoder 103 and multiplexer 108.

The above processing is repeated while there is a new input signal. When there is no new input signal, processing is terminated.

Enhancement layer coder 107 will now be described. FIG. 5 is a drawing showing an example of the configuration of enhancement layer coder 107. Enhancement layer coder 107 in FIG. 5 mainly comprises an LPC analyzer 501, spectral envelope calculator 502, MDCT section 503, power calculator 504, power normalizer 505, spectrum normalizer 506, Bark scale normalizer 508, Bark scale shape calculator 507, vector quantizer 509, and multiplexer 510.

LPC analyzer 501 performs LPC analysis on an input signal. LPC analyzer 501 then quantizes the LPC coefficients efficiently in the LSP domain or another parameter domain suitable for quantization, outputs the coding information to multiplexer 510, and outputs the quantized LPC coefficients to spectral envelope calculator 502. Spectral envelope calculator 502 calculates a spectral envelope from the quantized LPC coefficients, and outputs this spectral envelope to spectrum normalizer 506 and vector quantizer 509.

MDCT section 503 performs MDCT (Modified Discrete Cosine Transform) processing on the input signal, and outputs the obtained MDCT coefficients to power calculator 504 and power normalizer 505. Power calculator 504 finds and quantizes the power of the MDCT coefficients, and outputs the quantized power to power normalizer 505 and the coding information to multiplexer 510.

Power normalizer 505 normalizes the MDCT coefficients with the quantized power, and outputs the power-normalized MDCT coefficients to spectrum normalizer 506. Spectrum normalizer 506 normalizes the MDCT coefficients normalized according to the power using the spectral envelope, and outputs the normalized MDCT coefficients to Bark scale shape calculator 507 and Bark scale normalizer 508.

Bark scale shape calculator 507 calculates the shape of a spectrum band-divided at equal intervals by means of a Bark scale, then quantizes this spectrum shape, and outputs the quantized spectrum shape to Bark scale normalizer 508 and vector quantizer 509. Bark scale shape calculator 507 also outputs the coding information to multiplexer 510.

Bark scale normalizer 508 normalizes the normalized MDCT coefficients using the quantized Bark scale shape, and outputs the result to vector quantizer 509.

Vector quantizer 509 performs vector quantization of the normalized MDCT coefficients output from Bark scale normalizer 508, finds the code-vector at which distortion is smallest, and outputs the index of the code-vector to multiplexer 510 as coding information.

Multiplexer 510 multiplexes all of the coding information, and outputs the resulting signal to multiplexer 108.

The operation of enhancement layer coder 107 in FIG. 5 will now be described. The subtraction signal obtained by subtracter 106 in FIG. 1 undergoes LPC analysis by LPC analyzer 501, and the LPC coefficients are thereby calculated. The LPC coefficients are converted to a parameter suitable for quantization such as LSP coefficients, after which quantization is performed. Coding information related to the LPC coefficients obtained here is supplied to multiplexer 510.

Spectral envelope calculator 502 calculates a spectral envelope in accordance with Equation (6) below, based on the decoded LPC coefficients.

$$\mathrm{env}(m) = \left| \frac{1}{1 - \sum_{i=1}^{NP} \alpha_{q}(i)\, e^{-j \frac{2 \pi m i}{M}}} \right| \qquad (6)$$

Here, αq denotes the decoded LPC coefficients, NP indicates the order of the LPC coefficients, and M the spectral resolution. Spectral envelope env(m) obtained by means of Equation (6) is used by spectrum normalizer 506 and vector quantizer 509 described later herein.
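
A direct evaluation of Equation (6) can be sketched as below. The function name is hypothetical, and taking the magnitude of the complex response is an assumption made here so that the envelope is real-valued and usable as a divisor in the later normalization.

```python
import cmath
import math


def spectral_envelope(alpha_q, M):
    """Evaluate env(m) for m = 0..M-1 from decoded LPC coefficients alpha_q(1..NP)."""
    envelope = []
    for m in range(M):
        acc = 0j
        for i, a in enumerate(alpha_q, start=1):
            acc += a * cmath.exp(-2j * math.pi * m * i / M)
        # Magnitude of the all-pole response (assumed); a stable filter
        # keeps the denominator away from zero.
        envelope.append(abs(1.0 / (1.0 - acc)))
    return envelope


env = spectral_envelope([0.5], M=4)
```

With the single coefficient 0.5 the envelope is largest at m = 0 and smallest at m = 2, as expected for a first-order low-pass shape. For large M an FFT of the coefficient sequence would give the same values more cheaply.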

The input signal then undergoes MDCT processing in MDCT section 503, and the MDCT coefficients are obtained. A feature of MDCT processing is that frame boundary distortion does not occur, because an orthogonal basis is used in which the analysis frames of successive frames overlap by exactly one half, and the first half of the analysis frame is an odd function while the latter half of the analysis frame is an even function. When MDCT processing is performed, the input signal is multiplied by a window function such as a sine window. Designating the MDCT coefficients X(m), the MDCT coefficients are calculated in accordance with Equation (7) below.

$$X(m) = \sqrt{\frac{1}{N}} \sum_{n=0}^{2N-1} x(n) \cos\left\{ \frac{(2n + 1 + N)(2m + 1)\,\pi}{4N} \right\} \qquad (7)$$

Here, x(n) indicates the signal when the input signal is multiplied by a window function.
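
Equation (7) can be evaluated directly for one analysis frame, as in the sketch below. The particular sine window formula is an assumption (the text names a sine window only as one possible choice), and the direct double loop is for illustration rather than efficiency.

```python
import math


def sine_window(length):
    """A common sine window definition (assumed form)."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(frame):
    """Direct evaluation of Equation (7) for one frame of 2N samples."""
    two_n = len(frame)
    if two_n == 0 or two_n % 2:
        raise ValueError("frame length must be a positive even number")
    n_half = two_n // 2
    window = sine_window(two_n)
    x = [s * w for s, w in zip(frame, window)]  # windowed input x(n)
    scale = math.sqrt(1.0 / n_half)
    return [
        scale * sum(
            x[n] * math.cos((2 * n + 1 + n_half) * (2 * m + 1) * math.pi / (4 * n_half))
            for n in range(two_n)
        )
        for m in range(n_half)
    ]


coeffs = mdct([0.5, -0.25, 1.0, 0.75, -1.0, 0.0, 0.25, 0.5])
```

A frame of 2N samples yields N coefficients, so consecutive frames advanced by N samples keep the overall coefficient count equal to the sample count.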

Next, power calculator 504 finds the power of MDCT coefficients X(m) in accordance with Equation (8) below, and quantizes it. Power normalizer 505 then normalizes the MDCT coefficients using the quantized power.

$$\mathrm{pow} = \sum_{m=0}^{M-1} X(m)^{2} \qquad (8)$$

Here, M indicates the size of the MDCT coefficients. After MDCT coefficient power pow has been quantized, the coding information is sent to multiplexer 510. The power of the MDCT coefficients is decoded using the coding information, and the MDCT coefficients are normalized in accordance with Equation (9) below using the resulting value.

$$X_{1}(m) = \frac{X(m)}{\sqrt{\mathrm{pow}_{q}}} \qquad (9)$$

Here, X1(m) represents the MDCT coefficients after power normalization, and powq indicates the power of the MDCT coefficients after quantization.
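
The power calculation and normalization can be sketched as follows. The log-domain quantizer with a 1.5 dB step is purely an assumption for the example; the text does not say how the power is quantized.

```python
import math


def mdct_power(X):
    """Sum of squared MDCT coefficients."""
    return sum(v * v for v in X)


def quantize_power(pow_value, step_db=1.5, floor=1e-10):
    """Hypothetical log-domain scalar quantizer: returns (index, decoded power)."""
    level_db = 10.0 * math.log10(max(pow_value, floor))
    index = round(level_db / step_db)
    return index, 10.0 ** (index * step_db / 10.0)


def power_normalize(X, powq):
    """Divide each coefficient by the square root of the decoded power."""
    root = math.sqrt(powq)
    return [v / root for v in X]


X = [3.0, 4.0]
power = mdct_power(X)
power_code, powq = quantize_power(power)
X1 = power_normalize(X, powq)
```

Normalizing with the decoded value rather than the exact power keeps the encoder in step with the decoder, which only ever sees the quantized power.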

Spectrum normalizer 506 then uses the spectral envelope to normalize the MDCT coefficients that have been normalized according to power. Spectrum normalizer 506 performs normalization in accordance with Equation (10) below.

$$X2(m) = \frac{X1(m)}{\mathrm{env}(m)} \qquad (10)$$

Next, Bark scale shape calculator 507 calculates the shape of a spectrum band-divided at equal intervals by means of a Bark scale, then quantizes this spectrum shape. Bark scale shape calculator 507 sends this coding information to multiplexer 510, and also performs normalization of MDCT coefficients X2(m), which is the output signal from spectrum normalizer 506, using the decoded value. The correspondence between the Bark scale and Hertz scale is given by the conversion expression represented by Equation (11) below.

$$B = 13 \tan^{-1}(0.76 f) + 3.5 \tan^{-1}\left( \left( \frac{f}{7.5} \right)^{2} \right) \qquad (11)$$

Here, B indicates the Bark scale and f the Hertz scale (frequency expressed in kHz). Bark scale shape calculator 507 calculates a shape in accordance with Equation (12) below for the sub-bands band-divided at equal intervals on the Bark scale.

$$B(k) = \sum_{m=fl(k)}^{fh(k)} X2(m)^{2}, \quad 0 \leq k < K \qquad (12)$$

Here, fl(k) indicates the lowest frequency of the k'th sub-band and fh(k) the highest frequency of the k'th sub-band, and K indicates the number of sub-bands.

Bark scale shape calculator 507 then quantizes Bark scale shape B(k) of each band and sends the coding information to multiplexer 510, and also decodes the Bark scale shape and supplies the result to Bark scale normalizer 508 and vector quantizer 509. Using the Bark scale shape after quantization, Bark scale normalizer 508 generates normalized MDCT coefficients X3(m) in accordance with Equation (13) below.

$$X3(m) = \frac{X2(m)}{\sqrt{B_q(k)}}, \quad fl(k) \leq m \leq fh(k), \quad 0 \leq k < K \qquad (13)$$

Here, Bq(k) indicates the Bark scale shape after quantization of the k'th sub-band.
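The Bark scale shape calculation and normalization described above can be sketched as follows. This is a minimal illustration under stated assumptions: `hz_to_bark` uses the widely known Zwicker and Terhardt approximation, sub-band boundaries are derived by assigning each MDCT bin to one of K equal-width Bark intervals up to the Nyquist frequency, and the quantization of B(k) is omitted (the caller passes whatever decoded shape Bq(k) it has). All function and parameter names are hypothetical.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker-Terhardt approximation of the Bark scale (input in Hz)."""
    f_khz = f_hz / 1000.0
    return 13.0 * math.atan(0.76 * f_khz) + 3.5 * math.atan((f_khz / 7.5) ** 2)


def band_index(num_coeffs, num_bands, sample_rate):
    """Assign each MDCT bin m to a sub-band k of equal width on the Bark scale."""
    nyquist = sample_rate / 2.0
    bark_max = hz_to_bark(nyquist)
    bands = []
    for m in range(num_coeffs):
        centre_hz = (m + 0.5) * nyquist / num_coeffs
        k = int(hz_to_bark(centre_hz) / bark_max * num_bands)
        bands.append(min(k, num_bands - 1))
    return bands


def bark_shape(x2, bands, num_bands):
    """Equation (12): energy of X2(m) accumulated per sub-band."""
    shape = [0.0] * num_bands
    for value, k in zip(x2, bands):
        shape[k] += value * value
    return shape


def bark_normalize(x2, bands, shape_q, floor=1e-12):
    """Equation (13): divide each bin by the square root of its band's Bq(k)."""
    return [v / math.sqrt(max(shape_q[k], floor)) for v, k in zip(x2, bands)]
```

After normalization each non-empty sub-band has approximately unit energy, which is what allows the subsequent vector quantizer to work on a spectrally flat input.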

Next, vector quantizer 509 performs vector quantization of Bark scale normalizer 508 output X3(m). Vector quantizer 509 divides X3(m) into a plurality of vectors and finds the code vector at which distortion is smallest using a code book corresponding to each vector, and sends this index to multiplexer 510 as coding information.

When performing vector quantization, vector quantizer 509 determines two important parameters using input signal spectrum information. One of these parameters is quantization bit allocation, and the other is code book search weighting. Quantization bit allocation is determined using spectral envelope env(m) obtained by spectral envelope calculator 502.

When quantization bit allocation is determined using spectral envelope env(m), a setting can also be made so that the number of bits allocated in the spectrum corresponding to frequencies 0 to FL is made small.

One example of implementation of this is a method whereby the maximum number of bits that can be allocated in frequencies 0 to FL, MAX_LOWBAND_BIT, is set, and a restriction is imposed so that the number of bits allocated in this band does not exceed MAX_LOWBAND_BIT.
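The restriction described above can be illustrated with the following sketch. The text does not state how bits are derived from env(m), so the greedy rule used here (give each bit to the bin with the largest remaining log2 envelope, then reduce that bin's demand by one) is an assumption chosen only to make the MAX_LOWBAND_BIT cap concrete. The names `allocate_bits`, `low_band_end`, and `max_lowband_bit` are hypothetical.

```python
import math


def allocate_bits(env, total_bits, low_band_end, max_lowband_bit):
    """Greedy bit allocation with a cap on the low band.

    Bins [0, low_band_end) correspond to frequencies 0 to FL and together
    receive at most max_lowband_bit bits.
    """
    demand = [math.log2(max(e, 1e-12)) for e in env]
    bits = [0] * len(env)
    low_used = 0
    for _ in range(total_bits):
        # Low-band bins stay eligible only while the cap is not reached.
        candidates = [
            m for m in range(len(env))
            if m >= low_band_end or low_used < max_lowband_bit
        ]
        if not candidates:
            break
        best = max(candidates, key=lambda m: demand[m])
        bits[best] += 1
        demand[best] -= 1.0  # assumed: one bit halves the amplitude error
        if best < low_band_end:
            low_used += 1
    return bits
```

With a strong low-band envelope, the uncapped allocation concentrates bits below FL, while the capped allocation moves the surplus to frequencies FL to FH.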

In this implementation example, since coding has already been performed in the base layer at frequencies 0 to FL, it is not necessary to allocate a large number of bits to this band. Overall quality can be improved by intentionally making quantization in this band coarse, keeping its bit allocation low, and allocating the extra bits to frequencies FL to FH. A configuration may also be used whereby this bit allocation is determined by combining spectral envelope env(m) and the aforementioned Bark scale shape Bq(k).

Vector quantization is performed using a distortion measure employing spectral envelope env(m) obtained by spectral envelope calculator 502 and weighting calculated from quantized Bark scale shape Bq(k) obtained by Bark scale shape calculator 507. Vector quantization is implemented by finding index j of code vector C for which distortion D stipulated by Equation (14) below is minimal.

$$D = \sum_{m} w(m)^{2} \left( C_j(m) - X3(m) \right)^{2} \qquad (14)$$

Here, w(m) indicates the weighting function.
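The code book search of Equation (14) amounts to an exhaustive weighted nearest-neighbor search, which might be sketched as follows. The function names and the representation of the code book as a list of equal-length lists are assumptions made for illustration.

```python
def weighted_distortion(code_vector, target, weights):
    """Equation (14): sum over m of w(m)^2 * (C_j(m) - X3(m))^2."""
    return sum(
        (w * w) * (c - x) ** 2
        for c, x, w in zip(code_vector, target, weights)
    )


def search_codebook(codebook, target, weights):
    """Return (index j, distortion D) of the code vector minimizing D."""
    best_index, best_dist = -1, float("inf")
    for j, code_vector in enumerate(codebook):
        dist = weighted_distortion(code_vector, target, weights)
        if dist < best_dist:
            best_index, best_dist = j, dist
    return best_index, best_dist
```

For example, with code book `[[1.0, 0.0], [0.0, 1.0]]` and target `[0.6, 0.55]`, uniform weights select index 0, whereas weights `[0.1, 1.0]` select index 1, because errors in the first component are then almost ignored.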

Weighting function w(m) can be expressed as shown in Equation (15) below using spectral envelope env(m) and Bark scale shape Bq(k).

$$w(m) = \left( \mathrm{env}(m) \cdot B_q(\mathrm{Herz\_to\_Bark}(m)) \right)^{p} \qquad (15)$$

Here, p indicates a constant between 0 and 1, and Herz_to_Bark( ) indicates a function that converts from the Hertz scale to the Bark scale.

When weighting function w(m) is determined, it is also possible to make a setting so that the weighting function for the spectrum corresponding to frequencies 0 to FL is made small. One example of implementation of this is a method whereby the maximum value possible for weighting function w(m) corresponding to frequencies 0 to FL is set as MAX_LOWBAND_WGT, and a restriction is imposed so that the value of weighting function w(m) for this band does not exceed MAX_LOWBAND_WGT. In this implementation example, coding has already been performed in the base layer at frequencies 0 to FL, and overall quality can be improved by intentionally lowering the quantization precision in this band and relatively raising the quantization precision for frequencies FL to FH.
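The weighting function of Equation (15) together with the MAX_LOWBAND_WGT restriction can be sketched as below. Here `bin_to_band` stands in for Herz_to_Bark( ) by giving the sub-band index of each bin directly, and `low_band_end` marks the first bin above FL; both are assumptions of this example rather than details given in the text.

```python
def weighting(env, bq, bin_to_band, p=0.5, low_band_end=0, max_lowband_wgt=None):
    """Equation (15) with an optional cap on the weights below FL.

    env          spectral envelope env(m), one value per bin
    bq           decoded Bark scale shape Bq(k), one value per sub-band
    bin_to_band  sub-band index k for each bin m (role of Herz_to_Bark)
    """
    weights = []
    for m, e in enumerate(env):
        w = (e * bq[bin_to_band[m]]) ** p
        if max_lowband_wgt is not None and m < low_band_end:
            w = min(w, max_lowband_wgt)  # restriction for frequencies 0 to FL
        weights.append(w)
    return weights
```

Because the weights enter Equation (14) squared, capping them in the low band makes the search tolerate larger errors there and favors accuracy in frequencies FL to FH.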

Lastly, multiplexer 510 multiplexes the coding information and outputs the resultant signal to multiplexer 108. The above processing is repeated while there is a new input signal. When there is no new input signal, processing is terminated.

Thus, according to a signal processing apparatus of this embodiment, by extracting components not exceeding a predetermined frequency from an input signal and performing coding using code excited linear prediction, and performing coding by MDCT processing using the results of decoding the obtained coding information, it is possible to perform high-quality coding at a low bit rate.

An example has been described above in which the LPC coefficients are analyzed from a subtraction signal obtained by subtracter 106, but a signal processing apparatus of the present invention may also perform coding using the LPC coefficients decoded by local decoder 103.

FIG. 6 is a drawing showing an example of the configuration of enhancement layer coder 107. Parts in FIG. 6 identical to those in FIG. 5 are assigned the same reference numerals as in FIG. 5 and detailed descriptions thereof are omitted.

Enhancement layer coder 107 in FIG. 6 differs from enhancement layer coder 107 in FIG. 5 in being provided with a conversion table 601, LPC coefficient mapping section 602, spectral envelope calculator 603, and transformation section 604, and performing coding using the LPC coefficients decoded by local decoder 103.

Conversion table 601 stores base layer LPC coefficients and enhancement layer LPC coefficients with the correspondence therebetween indicated.

LPC coefficient mapping section 602 references conversion table 601, converts the base layer LPC coefficients input from local decoder 103 to the enhancement layer LPC coefficients, and outputs the enhancement layer LPC coefficients to spectral envelope calculator 603.

Spectral envelope calculator 603 obtains a spectral envelope based on the enhancement layer LPC coefficients, and outputs this spectral envelope to transformation section 604. Transformation section 604 transforms the spectral envelope and outputs the result to spectrum normalizer 506 and vector quantizer 509.

The operation of enhancement layer coder 107 in FIG. 6 will now be described. The base layer LPC coefficients are found for signals in signal band 0 to FL, and do not coincide with the LPC coefficients used by an enhancement layer signal (signal band 0 to FH). However, there is a strong correlation between the two. Therefore, for LPC coefficient mapping section 602, a conversion table 601 showing the correspondence between LPC coefficients for signal band 0 to FL signals and signal band 0 to FH signals is separately designed in advance using this correlation. This conversion table 601 is used to find the enhancement layer LPC coefficients from the base layer LPC coefficients.

FIG. 7 is a drawing showing an example of enhancement layer LPC coefficient calculation. Conversion table 601 is composed of J candidates {Yj(m)} indicating the enhancement layer LPC coefficients (order M), and candidates {yj(k)} that have the same order (=K) as the base layer LPC coefficients assigned correspondence to {Yj(m)}. {Yj(m)} and {yj(k)} are designed and provided beforehand from large-scale audio and speech data, etc. When base layer LPC coefficients x(k) are input, the sequence of the LPC coefficients most similar to x(k) is found from among {yj(k)}. By outputting enhancement layer LPC coefficients Yj(m) corresponding to index j of the LPC coefficients determined to be most similar, it is possible to implement mapping of the enhancement layer LPC coefficients from base layer LPC coefficients.
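The table lookup described above is essentially a nearest-neighbor search over paired entries, as in the following sketch. The text does not define the similarity measure, so squared Euclidean distance is assumed here; the function name `map_lpc` and the representation of the table as a list of (yj, Yj) pairs are also hypothetical.

```python
def map_lpc(base_lpc, table):
    """Map base layer LPC coefficients x(k) to enhancement layer coefficients.

    table is a list of (y_j, Y_j) pairs: y_j has the base layer order K,
    Y_j has the enhancement layer order M. Returns (j, Y_j) for the entry
    whose y_j is closest to base_lpc (squared Euclidean distance assumed).
    """
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    j = min(range(len(table)), key=lambda i: sq_dist(base_lpc, table[i][0]))
    return j, table[j][1]
```

Since the decoder holds the same table and the same decoded base layer coefficients, it can repeat this lookup without any additional coding information being transmitted.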

Next, spectral envelope calculator 603 obtains a spectral envelope based on the enhancement layer LPC coefficients found in this way. Then this spectral envelope is transformed by transformation section 604. This transformed spectral envelope is then regarded as a spectral envelope of the implementation example described above, and is processed accordingly.

One example of implementation of transformation section 604 that transforms a spectral envelope is processing whereby the effect of a spectral envelope corresponding to signal band 0 to FL subject to base layer coding is made small. If the spectral envelope is designated env(m), transformed spectral envelope env′(m) is expressed by Equation (16) below.

$$\mathrm{env}'(m) = \begin{cases} \mathrm{env}(m)^{p} & \text{if } 0 \leq m \leq FL \\ \mathrm{env}(m) & \text{otherwise} \end{cases} \qquad (16)$$

Here, p indicates a constant between 0 and 1.
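Equation (16) can be sketched in a few lines. The parameter `low_band_end`, the index of the last bin that corresponds to frequency FL, is an assumption of this example; the default p of 0.5 is likewise only illustrative.

```python
def transform_envelope(env, low_band_end, p=0.5):
    """Equation (16): compress the envelope for bins 0..low_band_end.

    Bins up to and including low_band_end (frequencies 0 to FL) are raised
    to the power p (0 < p < 1); the remaining bins are left unchanged.
    """
    return [e ** p if m <= low_band_end else e for m, e in enumerate(env)]
```

For instance, `transform_envelope([4.0, 9.0, 16.0, 25.0], 1, 0.5)` returns `[2.0, 3.0, 16.0, 25.0]`: raising to a power below 1 flattens the low-band envelope while leaving the band above FL untouched.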

Coding has already been performed in the base layer at frequencies 0 to FL, and the spectrum of frequencies 0 to FL of a subtraction signal subject to enhancement layer coding is close to flat. Nevertheless, this effect is not taken into account in LPC coefficient mapping as described in this implementation example. Quality can therefore be improved by using a technique of correcting the spectral envelope using Equation (16).

Thus, according to a signal processing apparatus of this embodiment, by finding the enhancement layer LPC coefficients using the LPC coefficients quantized by a base layer quantizer, and calculating a spectral envelope from the enhancement layer LPC coefficients, LPC analysis and quantization are made unnecessary, and the number of quantization bits can be reduced.

Embodiment 3

FIG. 8 is a block diagram showing the configuration of the enhancement layer coder of a signal processing apparatus according to Embodiment 3 of the present invention. Parts in FIG. 8 identical to those in FIG. 5 are assigned the same reference numerals as in FIG. 5 and detailed descriptions thereof are omitted.

Enhancement layer coder 107 in FIG. 8 differs from the enhancement layer coder in FIG. 5 in being provided with a spectral fine structure calculator 801, calculating spectral fine structure using a pitch period coded by base layer coder 102 and decoded by local decoder 103, and employing that spectral fine structure in spectrum normalization and vector quantization.

Spectral fine structure calculator 801 calculates the spectral fine structure from pitch period T and pitch gain β coded in the base layer, and outputs the spectral fine structure to spectrum normalizer 506.

The aforementioned pitch period T and pitch gain β are actually parts of the coding information, and the same information can be obtained by a local decoder (shown in FIG. 1). Thus the bit rate does not increase even if coding is performed using pitch period T and pitch gain β.

Using pitch period T and pitch gain β, spectral fine structure calculator 801 calculates spectral fine structure har(m) in accordance with Equation (17) below.

$$\mathrm{har}(m) = \left| \frac{1}{1 - \beta \cdot e^{-j \frac{2 \pi m T}{M}}} \right| \qquad (17)$$

Here, M indicates the spectral resolution. Because Equation (17) becomes an oscillating filter when the absolute value of β is greater than or equal to 1, the absolute value of β may also be restricted so that it does not exceed a predetermined value less than 1 (for example, 0.8).
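The spectral fine structure calculation, including the restriction on the pitch gain, can be sketched as follows. The magnitude of the complex response is taken so that har(m) is real-valued, which is an assumption consistent with its later use as a divisor of MDCT coefficients; the function name and the `max_gain` parameter (set to the example value 0.8 from the text) are hypothetical.

```python
import cmath
import math


def spectral_fine_structure(pitch_period, pitch_gain, resolution, max_gain=0.8):
    """Magnitude response of the pitch filter 1 / (1 - beta * z^-T) on M bins."""
    # Clamp |beta| so that the filter stays stable (example limit: 0.8).
    beta = max(-max_gain, min(max_gain, pitch_gain))
    har = []
    for m in range(resolution):
        phase = -2.0 * math.pi * m * pitch_period / resolution
        denom = 1.0 - beta * cmath.exp(1j * phase)
        har.append(1.0 / abs(denom))
    return har
```

With T = 4, β = 0.5, and M = 16, the result peaks at 2.0 for m = 0, 4, 8, 12 (multiples of M/T) and dips to 2/3 midway between the peaks, which is the comb-like harmonic structure expected of voiced speech.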

Spectrum normalizer 506 performs normalization in accordance with Equation (18) below, using both spectral envelope env(m) obtained by spectral envelope calculator 502 and spectral fine structure har(m) obtained by spectral fine structure calculator 801.

$$X2(m) = \frac{X1(m)}{\mathrm{env}(m) \cdot \mathrm{har}(m)} \qquad (18)$$

The allocation of quantization bits by vector quantizer 509 is also determined using both spectral envelope env(m) obtained by spectral envelope calculator 502 and spectral fine structure har(m) obtained by spectral fine structure calculator 801. The spectral fine structure is also used in weighting function w(m) determination in vector quantization. To be specific, weighting function w(m) is defined in accordance with Equation (19) below.

$$w(m) = \left( \mathrm{env}(m) \cdot \mathrm{har}(m) \cdot B_q(\mathrm{Herz\_to\_Bark}(m)) \right)^{p} \qquad (19)$$

Here, p indicates a constant between 0 and 1, and Herz_to_Bark( ) indicates a function that converts from the Hertz scale to the Bark scale.

Thus, according to a signal processing apparatus of this embodiment, by calculating a spectral fine structure using a pitch period coded by a base layer coder and decoded by a local decoder, and using that spectral fine structure in spectrum normalization and vector quantization, quantization performance can be improved.

Embodiment 4

FIG. 9 is a block diagram showing the configuration of the enhancement layer coder of a signal processing apparatus according to Embodiment 4 of the present invention. Parts in FIG. 9 identical to those in FIG. 5 are assigned the same reference numerals as in FIG. 5 and detailed descriptions thereof are omitted.

Enhancement layer coder 107 in FIG. 9 differs from the enhancement layer coder in FIG. 5 in being provided with a power estimation unit 901 and power fluctuation amount quantizer 902, and in generating a decoded signal in local decoder 103 using coding information obtained by base layer coder 102, predicting the MDCT coefficient power from that decoded signal, and coding the amount of fluctuation from that predicted value.

In FIG. 1 a decoded parameter is output from local decoder 103 to enhancement layer coder 107, but in this embodiment a decoded signal obtained by local decoder 103 is output to enhancement layer coder 107 instead of a decoded parameter.

Signal sl(n) decoded by local decoder 103 in FIG. 5 is input to power estimation unit 901. Power estimation unit 901 then estimates the MDCT coefficient power from this decoded signal sl(n). If the MDCT coefficient power estimate is designated powp, powp is expressed by Equation (20) below.

$$powp = \alpha \cdot \sum_{n=0}^{N-1} sl(n)^{2} \qquad (20)$$

Here, N indicates the length of decoded signal sl(n), and α indicates a predetermined constant for correction. In another method that uses spectrum tilt found from the base layer LPC coefficients, an MDCT coefficient power estimate is expressed by Equation (21) below.

$$powp = \alpha \cdot \beta \cdot \sum_{n=0}^{N-1} sl(n)^{2} \qquad (21)$$

Here, β denotes a variable that depends on the spectrum tilt found from the base layer LPC coefficients, having a property of approaching zero when the spectrum tilt is large (when a large amount of spectral energy is in the low band), and approaching 1 when the spectrum tilt is small (when there is power in a relatively high region).

Next, power fluctuation amount quantizer 902 normalizes the power of the MDCT coefficients obtained by MDCT section 503 by means of power estimate powp obtained by power estimation unit 901, and quantizes the fluctuation amount. Fluctuation amount r is expressed by Equation (22) below.

$$r = \frac{pow}{powp} \qquad (22)$$

Here, pow indicates the MDCT coefficient power, and is calculated by means of Equation (23).

$$pow = \sum_{m=0}^{M-1} X(m)^{2} \qquad (23)$$

Here, X(m) indicates the MDCT coefficients, and M indicates the frame length. Power fluctuation amount quantizer 902 quantizes fluctuation amount r, sends the coding information to multiplexer 510, and also decodes quantized fluctuation amount rq. Using quantized fluctuation amount rq, power normalizer 505 normalizes the MDCT coefficients using Equation (24) below.

$$X1(m) = \frac{X(m)}{\sqrt{rq \cdot powp}} \qquad (24)$$

Here, X1(m) indicates the MDCT coefficients after power normalization.
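The power prediction and fluctuation coding of Equations (20) through (24) can be sketched as follows. The quantizer for r is not specified in the text, so a uniform quantizer in the log2 domain with a step of 0.5 is assumed; `tilt_factor` plays the role of β in Equation (21) and defaults to 1, which reduces to Equation (20). All names are hypothetical.

```python
import math


def estimate_power(decoded, alpha=1.0, tilt_factor=1.0):
    """Equations (20)/(21): predicted MDCT power from the base layer signal."""
    return alpha * tilt_factor * sum(s * s for s in decoded)


def quantize_fluctuation(r, step=0.5):
    """Hypothetical quantizer for r: uniform steps in the log2 domain."""
    return round(math.log2(max(r, 1e-12)) / step)


def dequantize_fluctuation(index, step=0.5):
    """Decode the quantized fluctuation amount rq from its index."""
    return 2.0 ** (index * step)


def normalize_with_prediction(coeffs, decoded, alpha=1.0, tilt_factor=1.0):
    """Equations (22)-(24): code r = pow / powp, normalize by sqrt(rq * powp)."""
    powp = max(estimate_power(decoded, alpha, tilt_factor), 1e-12)
    power = sum(c * c for c in coeffs)           # Equation (23)
    index = quantize_fluctuation(power / powp)   # Equation (22), quantized
    rq = dequantize_fluctuation(index)
    x1 = [c / math.sqrt(rq * powp) for c in coeffs]  # Equation (24)
    return index, x1
```

When the prediction is good, r stays close to 1 and its dynamic range is much narrower than that of the raw power, which is why fewer bits are needed than when quantizing pow directly.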

Thus, according to a signal processing apparatus of this embodiment, by using the correlation between base layer decoded signal power and enhancement layer MDCT coefficient power, predicting MDCT coefficient power using a base layer decoded signal, and coding the amount of fluctuation from that predicted value, it is possible to reduce the number of bits necessary for MDCT coefficient power quantization.

Embodiment 5

FIG. 10 is a block diagram showing the configuration of a signal processing apparatus according to Embodiment 5 of the present invention. Signal processing apparatus 1000 in FIG. 10 mainly comprises a demultiplexer 1001, base layer decoder 1002, up-sampler 1003, enhancement layer decoder 1004, and adder 1005.

Demultiplexer 1001 separates coding information, and generates base layer coding information and enhancement layer coding information. Then demultiplexer 1001 outputs base layer coding information to base layer decoder 1002, and outputs enhancement layer coding information to enhancement layer decoder 1004.

Base layer decoder 1002 decodes a sampling rate FL decoded signal using the base layer coding information obtained by demultiplexer 1001, and outputs the resulting signal to up-sampler 1003. At the same time, a parameter decoded by base layer decoder 1002 is output to enhancement layer decoder 1004. Up-sampler 1003 raises the decoded signal sampling frequency to FH, and outputs this to adder 1005.

Enhancement layer decoder 1004 decodes the sampling rate FH decoded signal using the enhancement layer coding information obtained by demultiplexer 1001 and the parameter decoded by base layer decoder 1002, and outputs the resulting signal to adder 1005.

Adder 1005 performs addition of the decoded signal output from up-sampler 1003 and the decoded signal output from enhancement layer decoder 1004.

The operation of a signal processing apparatus of this embodiment will now be described. First, coding information generated by a signal processing apparatus of any of Embodiments 1 through 4 is input, and is separated by demultiplexer 1001 into base layer coding information and enhancement layer coding information.

Next, base layer decoder 1002 decodes a sampling rate FL decoded signal using the base layer coding information obtained by demultiplexer 1001. Then up-sampler 1003 raises the sampling frequency of that decoded signal to FH.

In enhancement layer decoder 1004, the sampling rate FH decoded signal is decoded using enhancement layer coding information obtained by demultiplexer 1001 and a parameter decoded by base layer decoder 1002.

The base layer decoded signal up-sampled by up-sampler 1003 and the enhancement layer decoded signal are added by adder 1005. The above processing is repeated while there is a new input signal. When there is no new input signal, processing is terminated.

Thus, according to a signal processing apparatus of this embodiment, by having enhancement layer decoder 1004 perform decoding using parameters decoded by base layer decoder 1002, it is possible to generate a decoded signal from coding information of a sound coding unit that performs enhancement layer coding using decoding parameters in base layer coding.

Base layer decoder 1002 will now be described. FIG. 11 is a block diagram showing an example of base layer decoder 1002. Base layer decoder 1002 in FIG. 11 mainly comprises a demultiplexer 1101, excitation generator 1102, and synthesis filter 1103, and performs CELP decoding processing.

Demultiplexer 1101 separates various parameters from base layer coding information output from demultiplexer 1001, and outputs these parameters to excitation generator 1102 and synthesis filter 1103.

Excitation generator 1102 decodes the adaptive vector, adaptive vector gain, noise vector, and noise vector gain, generates an excitation signal using these, and outputs this excitation signal to synthesis filter 1103. Synthesis filter 1103 generates a synthesized signal using the decoded LPC coefficients.

The operation of base layer decoder 1002 in FIG. 11 will now be described. First, demultiplexer 1101 separates various parameters from base layer coding information.

Next, excitation generator 1102 decodes the adaptive vector, adaptive vector gain, noise vector, and noise vector gain. Then excitation generator 1102 generates excitation vector ex(n) in accordance with Equation (25) below.

$$ex(n) = \beta_q \cdot q(n) + \gamma_q \cdot c(n) \qquad (25)$$

Here, q(n) indicates an adaptive vector, βq adaptive vector gain, c(n) a noise vector, and γq noise vector gain.

Synthesis filter 1103 then generates synthesized signal syn(n) in accordance with Equation (26) below, using the decoded LPC coefficients.

$$\mathrm{syn}(n) = \mathrm{ex}(n) + \sum_{i=1}^{NP} \alpha_q(i) \cdot \mathrm{syn}(n - i) \qquad (26)$$

Here, αq indicates the decoded LPC coefficients, and NP the order of the LPC coefficients.
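Equations (25) and (26) can be sketched as follows. The function names are hypothetical, and the optional `history` argument (previous output samples, most recent last) is an assumption added so that the filter state can be carried across frames.

```python
def excitation(adaptive, noise, gain_adaptive, gain_noise):
    """Equation (25): ex(n) = beta_q * q(n) + gamma_q * c(n)."""
    return [gain_adaptive * q + gain_noise * c for q, c in zip(adaptive, noise)]


def synthesize(ex, lpc, history=None):
    """Equation (26): all-pole synthesis filter.

    lpc[i] holds alpha_q(i + 1); history holds earlier syn(n) samples
    (most recent last) and is treated as zeros when not supplied.
    """
    order = len(lpc)
    # Pad with zeros so that at least `order` past samples are available.
    state = [0.0] * order + list(history or [])
    out = []
    for e in ex:
        s = e + sum(lpc[i] * state[-1 - i] for i in range(order))
        out.append(s)
        state.append(s)
    return out
```

For example, a unit impulse through a first-order filter with coefficient 0.5, `synthesize([1.0, 0.0, 0.0, 0.0], [0.5])`, gives `[1.0, 0.5, 0.25, 0.125]`, the expected decaying impulse response.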

Decoded signal syn(n) decoded in this way is output to up-sampler 1003, and a parameter obtained as a result of decoding is output to enhancement layer decoder 1004. The above processing is repeated while there is a new input signal. When there is no new input signal, processing is terminated. Depending on the CELP configuration, a mode is also possible in which a synthesized signal is output after passing through a post-filter. The post-filter mentioned here has a function of post-processing to make coding distortion less perceptible.

Enhancement layer decoder 1004 will now be described. FIG. 12 is a block diagram showing an example of enhancement layer decoder 1004. Enhancement layer decoder 1004 in FIG. 12 mainly comprises a demultiplexer 1201, LPC coefficient decoder 1202, spectral envelope calculator 1203, vector decoder 1204, Bark scale shape decoder 1205, multiplier 1206, multiplier 1207, power decoder 1208, multiplier 1209, and IMDCT section 1210.

Demultiplexer 1201 separates various parameters from enhancement layer coding information output from demultiplexer 1001. LPC coefficient decoder 1202 decodes the LPC coefficients using the coding information related to the LPC coefficients, and outputs the result to spectral envelope calculator 1203.

Spectral envelope calculator 1203 calculates spectral envelope env(m) in accordance with Equation (6) using the decoded LPC coefficients, and outputs spectral envelope env(m) to vector decoder 1204 and multiplier 1207.

Vector decoder 1204 determines quantization bit allocation based on spectral envelope env(m) obtained by spectral envelope calculator 1203, and decodes normalized MDCT coefficients X3q(m) from coding information obtained from demultiplexer 1201 and the aforementioned quantization bit allocation. The quantization bit allocation method is the same as that used in enhancement layer coding in the coding method of any of Embodiments 1 through 4.

Bark scale shape decoder 1205 decodes Bark scale shape Bq(k) based on coding information obtained from demultiplexer 1201, and outputs the result to multiplier 1206.

Multiplier 1206 multiplies normalized MDCT coefficients X3q(m) by Bark scale shape Bq(k) in accordance with Equation (27) below, and outputs the result of the multiplication to multiplier 1207.

$$X2_q(m) = X3_q(m) \sqrt{B_q(k)}, \quad fl(k) \leq m \leq fh(k), \quad 0 \leq k < K \qquad (27)$$

Here, fl(k) indicates the lowest frequency of the k'th sub-band and fh(k) the highest frequency of the k'th sub-band, and K indicates the number of sub-bands.

Multiplier 1207 multiplies normalized MDCT coefficients X2q(m) obtained from multiplier 1206 by spectral envelope env(m) obtained by spectral envelope calculator 1203 in accordance with Equation (28) below, and outputs the result of the multiplication to multiplier 1209.

$$X1_q(m) = X2_q(m) \, \mathrm{env}(m) \qquad (28)$$

Power decoder 1208 decodes power powq based on coding information obtained from demultiplexer 1201, and outputs the result of the decoding to multiplier 1209.

Multiplier 1209 multiplies normalized MDCT coefficients X1q(m) by the square root of decoded power powq in accordance with Equation (29) below, and outputs the result of the multiplication to IMDCT section 1210.

$$X_q(m) = X1_q(m) \sqrt{powq} \qquad (29)$$
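The three multiplier stages (1206, 1207, and 1209) undo the coder's normalizations in reverse order, as the following sketch illustrates. The function name `denormalize` and the `bands` argument, which gives the sub-band index k of each bin m, are assumptions of this example; the final stage uses decoded power powq as described in the text.

```python
import math


def denormalize(x3q, bands, bq, env, powq):
    """Reverse the Bark, envelope, and power normalizations bin by bin."""
    out = []
    for m, value in enumerate(x3q):
        x2q = value * math.sqrt(bq[bands[m]])  # multiplier 1206, Equation (27)
        x1q = x2q * env[m]                     # multiplier 1207, Equation (28)
        out.append(x1q * math.sqrt(powq))      # multiplier 1209, Equation (29)
    return out
```

For example, `denormalize([1.0, 1.0], [0, 1], [4.0, 9.0], [0.5, 2.0], 16.0)` returns `[4.0, 24.0]`.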

IMDCT section 1210 executes IMDCT (Inverse Modified Discrete Cosine Transform) processing on the decoded MDCT coefficients obtained in this way, overlaps and adds the latter half of the signal obtained in the previous frame and the first half of the signal obtained in the current frame, and outputs the resultant signal as an output signal. The above processing is repeated while there is a new input signal. When there is no new input signal, processing is terminated.
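The overlap-add step can be checked with the following round-trip sketch. The text gives no IMDCT formula, so the inverse transform here is an assumption: it reuses the cosine kernel of Equation (7), applies the same sine window a second time, and uses a scale of 2·sqrt(1/N), chosen so that overlap-adding adjacent frames reconstructs the input when the coefficients are passed through unquantized. All function names are hypothetical.

```python
import math


def sine_window(length):
    """Sine window (assumed); satisfies w(n)^2 + w(n + N)^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def _angle(n, m, n_half):
    return (2 * n + 1 + n_half) * (2 * m + 1) * math.pi / (4 * n_half)


def mdct(frame):
    """Windowed forward transform as in Equation (7): 2N samples -> N values."""
    n_half = len(frame) // 2
    window = sine_window(2 * n_half)
    x = [s * w for s, w in zip(frame, window)]
    scale = math.sqrt(1.0 / n_half)
    return [
        scale * sum(x[n] * math.cos(_angle(n, m, n_half)) for n in range(2 * n_half))
        for m in range(n_half)
    ]


def imdct(coeffs):
    """Assumed inverse: N values -> 2N windowed, time-aliased samples."""
    n_half = len(coeffs)
    window = sine_window(2 * n_half)
    scale = 2.0 * math.sqrt(1.0 / n_half)
    return [
        window[n] * scale
        * sum(coeffs[m] * math.cos(_angle(n, m, n_half)) for m in range(n_half))
        for n in range(2 * n_half)
    ]


def overlap_add(prev_frame, cur_frame):
    """Add the latter half of the previous frame to the first half of the current."""
    n_half = len(cur_frame) // 2
    return [prev_frame[n_half + n] + cur_frame[n] for n in range(n_half)]
```

Each IMDCT output on its own contains time-domain aliasing; the aliasing terms of adjacent frames have opposite signs and cancel in the overlap-add, which is why frame boundary distortion does not occur.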

Thus, according to a signal processing apparatus of this embodiment, by performing enhancement layer decoder decoding using parameters decoded by a base layer decoder, it is possible to generate a decoded signal from coding information of a coding unit that performs enhancement layer coding using decoding parameters in base layer coding.

Embodiment 6

FIG. 13 is a drawing showing an example of the configuration of enhancement layer decoder 1004. Parts in FIG. 13 identical to those in FIG. 12 are assigned the same reference numerals as in FIG. 12 and detailed descriptions thereof are omitted.

Enhancement layer decoder 1004 in FIG. 13 differs from enhancement layer decoder 1004 in FIG. 12 in being provided with a conversion table 1301, LPC coefficient mapping section 1302, spectral envelope calculator 1303, and transformation section 1304, and performing decoding using the LPC coefficients decoded by base layer decoder 1002.

Conversion table 1301 stores base layer LPC coefficients and enhancement layer LPC coefficients with the correspondence therebetween indicated.

LPC coefficient mapping section 1302 references conversion table 1301, converts the base layer LPC coefficients input from base layer decoder 1002 to the enhancement layer LPC coefficients, and outputs the enhancement layer LPC coefficients to spectral envelope calculator 1303.

Spectral envelope calculator 1303 obtains a spectral envelope based on the enhancement layer LPC coefficients, and outputs this spectral envelope to transformation section 1304. Transformation section 1304 transforms the spectral envelope and outputs the result to multiplier 1207 and vector decoder 1204. An example of the transformation method is the method shown in Equation (16) of Embodiment 2.

The operation of enhancement layer decoder 1004 in FIG. 13 will now be described. The base layer LPC coefficients are found for signals in signal band 0 to FL, and do not coincide with the LPC coefficients used by an enhancement layer signal (signal band 0 to FH). However, there is a strong correlation between the two. Therefore, for LPC coefficient mapping section 1302, a conversion table 1301 showing the correspondence between LPC coefficients for signal band 0 to FL signals and signal band 0 to FH signals is separately designed in advance using this correlation. This conversion table 1301 is used to find the enhancement layer LPC coefficients from the base layer LPC coefficients.

Details of conversion table 1301 are the same as for conversion table 601 in Embodiment 2.

Thus, according to a signal processing apparatus of this embodiment, by finding the enhancement layer LPC coefficients using the LPC coefficients decoded by a base layer decoder, and calculating a spectral envelope from the enhancement layer LPC coefficients, LPC analysis and quantization are made unnecessary, and the number of quantization bits can be reduced.

Embodiment 7

FIG. 14 is a block diagram showing the configuration of the enhancement layer decoder of a signal processing apparatus according to Embodiment 7 of the present invention. Parts in FIG. 14 identical to those in FIG. 12 are assigned the same reference numerals as in FIG. 12 and detailed descriptions thereof are omitted.

Enhancement layer decoder 1004 in FIG. 14 differs from the enhancement layer decoder in FIG. 12 in being provided with a spectral fine structure calculator 1401, calculating spectral fine structure using a pitch period decoded by base layer decoder 1002, employing that spectral fine structure in decoding, and performing sound decoding corresponding to sound coding whereby quantization performance is improved.

Spectral fine structure calculator 1401 calculates the spectral fine structure from pitch period T and pitch gain β decoded by base layer decoder 1002, and outputs the spectral fine structure to vector decoder 1204 and multiplier 1207.

Using pitch period Tq and pitch gain βq, spectral fine structure calculator 1401 calculates spectral fine structure har(m) in accordance with Equation (30) below.

$$\mathrm{har}(m) = \left| \frac{1}{1 - \beta_q \cdot e^{-j \frac{2 \pi m T_q}{M}}} \right| \qquad (30)$$

Here, M indicates the spectral resolution. Because Equation (30) becomes an oscillating filter when the absolute value of βq is greater than or equal to 1, the absolute value of βq may also be restricted so that it does not exceed a predetermined value less than 1 (for example, 0.8).

The allocation of quantization bits by vector decoder 1204 is also determined using spectral envelope env(m) obtained by spectral envelope calculator 1203 and spectral fine structure har(m) obtained by spectral fine structure calculator 1401. Then normalized MDCT coefficients X3q(m) are decoded from that quantization bit allocation and coding information obtained from demultiplexer 1201. Also, normalized MDCT coefficients X1q(m) are found by multiplying normalized MDCT coefficients X2q(m) by spectral envelope env(m) and spectral fine structure har(m) in accordance with Equation (31) below.

$$X1_q(m) = X2_q(m) \, \mathrm{env}(m) \, \mathrm{har}(m) \qquad (31)$$

Thus, according to a signal processing apparatus of this embodiment, by calculating a spectral fine structure using a pitch period coded by a base layer coder and decoded by a local decoder, and using that spectral fine structure in spectrum normalization and vector quantization, it is possible to perform sound decoding corresponding to sound coding whereby quantization performance is improved.

Embodiment 8

FIG. 15 is a block diagram showing the configuration of the enhancement layer decoder of a signal processing apparatus according to Embodiment 8 of the present invention. Parts in FIG. 15 identical to those in FIG. 12 are assigned the same reference numerals as in FIG. 12 and detailed descriptions thereof are omitted.

Enhancement layer decoder 1004 in FIG. 15 differs from the enhancement layer decoder in FIG. 12 in being provided with a power estimation unit 1501, power fluctuation amount decoder 1502, and power generator 1503, and in forming a decoder corresponding to a coder that predicts MDCT coefficient power using a base layer decoded signal, and encodes the amount of fluctuation from that predicted value.

In FIG. 10 a decoded parameter is output from base layer decoder 1002 to enhancement layer decoder 1004, but in this embodiment a decoded signal obtained by base layer decoder 1002 is output to enhancement layer decoder 1004 instead of a decoded parameter.

Power estimation unit 1501 estimates the power of the MDCT coefficientsfrom decoded signal sl (n) decoded by base layer decoder 1002, usingEquation (20) or Equation (21).

Power fluctuation amount decoder 1502 decodes the power fluctuationamount from coding information obtained from demultiplexer 1201, andoutputs this to power generator 1503. Power generator 1503 calculatespower from the power fluctuation amount.

Multiplier 1209 finds the MDCT coefficients in accordance with Equation (32) below.

$$X_q(m) = X1_q(m)\sqrt{rq \cdot powp} \qquad (32)$$

Here, rq indicates the power fluctuation amount, and powp the power estimate. X1q(m) indicates the output signal from multiplier 1207.
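As an illustration only, the scaling of Equation (32) can be sketched as follows. The function name `scale_mdct` and the sample values are hypothetical and do not appear in the specification.

```python
import math


def scale_mdct(x1q, rq, powp):
    """Equation (32): X_q(m) = X1_q(m) * sqrt(rq * powp).

    x1q  : output of multiplier 1207, one value per frequency m
    rq   : decoded power fluctuation amount
    powp : power estimated from the base layer decoded signal
    """
    gain = math.sqrt(rq * powp)
    return [x * gain for x in x1q]


# Hypothetical values chosen only to show the call.
restored = scale_mdct([1.0, 2.0], rq=4.0, powp=4.0)
```

Because powp is derived from the base layer decoded signal, only rq has to be carried in the enhancement layer coding information.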

Thus, according to a signal processing apparatus of this embodiment, by configuring a decoder corresponding to a coder that predicts MDCT coefficient power using a base layer decoded signal and encodes the amount of fluctuation from that predicted value, it is possible to reduce the number of bits necessary for MDCT coefficient power quantization.

Embodiment 9

FIG. 16 is a block diagram showing the configuration of a sound coding apparatus according to Embodiment 9 of the present invention. Sound coding apparatus 1600 in FIG. 16 mainly comprises a down-sampler 1601, base layer coder 1602, local decoder 1603, up-sampler 1604, delayer 1605, subtracter 1606, frequency determination section 1607, enhancement layer coder 1608, and multiplexer 1609.

In FIG. 16, down-sampler 1601 receives sampling rate FH input data (acoustic data), converts this input data to sampling rate FL lower than sampling rate FH, and outputs the result to base layer coder 1602.

Base layer coder 1602 encodes the sampling rate FL input data in predetermined basic frame units, and outputs the first coding information to local decoder 1603 and multiplexer 1609. Base layer coder 1602 may code input data using the CELP method, for example.

Local decoder 1603 decodes the first coding information, and outputs the decoded signal obtained by decoding to up-sampler 1604. Up-sampler 1604 raises the decoded signal sampling rate to FH, and outputs the result to subtracter 1606 and frequency determination section 1607.

Delayer 1605 delays the input signal by a predetermined time, then outputs the signal to subtracter 1606. By making this delay time equal to the time delay arising in down-sampler 1601, base layer coder 1602, local decoder 1603, and up-sampler 1604, phase shift is prevented in the following subtraction processing. Subtracter 1606 performs subtraction between the input signal and decoded signal, and outputs the result of the subtraction to enhancement layer coder 1608 as an error signal.

Frequency determination section 1607 determines an area for which error signal coding is performed and an area for which error signal coding is not performed from the decoded signal for which the sampling rate has been raised to FH, and notifies enhancement layer coder 1608. For example, frequency determination section 1607 determines the frequency for auditory masking from the decoded signal for which the sampling rate has been raised to FH, and outputs this to enhancement layer coder 1608.

Enhancement layer coder 1608 converts the error signal to a frequency domain and generates an error spectrum, and performs error spectrum coding based on frequency information obtained from frequency determination section 1607. Multiplexer 1609 multiplexes coding information obtained by coding by base layer coder 1602 and coding information obtained by coding by enhancement layer coder 1608.

The signals coded by base layer coder 1602 and enhancement layer coder 1608 respectively will now be described. FIG. 17 is a drawing showing an example of acoustic signal information distribution. In FIG. 17, the vertical axis indicates the amount of information, and the horizontal axis indicates frequency. FIG. 17 shows how much speech information and background music and background noise information contained in the input signal are present in which frequency bands.

As shown in FIG. 17, in the case of speech information, there is a large amount of information in the low frequency region, and the amount of information decreases as the frequency increases. Conversely, in the case of background music and background noise information, there is comparatively little information in the lower region compared with speech information, and a large amount of information in the higher region.

Thus, in the base layer, speech signals are coded with high quality using CELP, and in the enhancement layer, background music or environmental sound that cannot be represented in the base layer, and signals with higher frequency components than the frequency region covered by the base layer, are coded efficiently.

FIG. 18 is a drawing showing an example of coding regions in the base layer and enhancement layer. In FIG. 18, the vertical axis indicates the amount of information, and the horizontal axis indicates frequency. FIG. 18 shows the regions that are the object of information coded by base layer coder 1602 and enhancement layer coder 1608 respectively.

Base layer coder 1602 is designed to represent efficiently speech information in the frequency band from 0 to FL, and can perform good-quality coding of speech information in this region. However, with base layer coder 1602, the coding quality of background music and background noise information in the frequency band from 0 to FL is not high.

Enhancement layer coder 1608 is designed to cover portions for which the capability of base layer coder 1602 is insufficient, as described above, and signals in the frequency band from FL to FH. Thus, by combining base layer coder 1602 and enhancement layer coder 1608, it is possible to implement high-quality coding in a wide band.

As shown in FIG. 18, the first coding information obtained by coding in base layer coder 1602 contains speech information in the frequency band between 0 and FL, and therefore a scalable function can be implemented whereby a decoded signal can be obtained from the first coding information alone.

Also, raising coding efficiency by using auditory masking in the enhancement layer can be considered. Auditory masking employs the human auditory characteristic whereby, when a certain signal is supplied, a signal in the vicinity of the frequency of that signal cannot be heard (is masked).

FIG. 19 is a drawing showing an example of an acoustic (music) signal spectrum. In FIG. 19, the solid line indicates auditory masking, and the dotted line indicates the error spectrum. “Error spectrum” here means the spectrum of an error signal (enhancement layer input signal) for an input signal and base layer decoded signal.

In the error spectrum indicated by shaded areas in FIG. 19, amplitude values are lower than the auditory masking, and therefore sound cannot be heard by the human ear, while in other regions error spectrum amplitude values exceed the auditory masking, and therefore quantization distortion is perceived.

In the enhancement layer, it is only necessary to code the error spectrum included in the white areas in FIG. 19 so that quantization distortion of those regions is smaller than the auditory masking. Coefficients belonging to the shaded areas are already smaller than the auditory masking, and so need not be quantized.

In sound coding apparatus 1600 of this embodiment, a frequency at which a residual error signal is coded according to auditory masking, etc., is not transmitted from the coding side to the decoding side, and the error spectrum frequency at which enhancement layer coding is performed is determined separately by the coding side and the decoding side using an up-sampled base layer decoded signal.

In the case of a decoded signal resulting from decoding of base layer coding information, the same signal is obtained by the coding side and the decoding side, and therefore by having the coding side code the signal by determining the auditory masking frequency from this decoded signal, and having the decoding side decode the signal by obtaining auditory masking frequency information from this decoded signal, it becomes unnecessary to code and transmit error spectrum frequency information as additional information, enabling a reduction in the bit rate to be achieved.

Next, the operation of each block of a sound coding apparatus according to this embodiment will be described in detail. First, the operation of frequency determination section 1607, which determines an error spectrum frequency coded in the enhancement layer from an up-sampled base layer decoded signal (hereinafter referred to as “base layer decoded signal”), will be described. FIG. 20 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus of this embodiment.

In FIG. 20, frequency determination section 1607 mainly comprises an FFT section 1901, estimated auditory masking calculator 1902, and determination section 1903.

FFT section 1901 performs orthogonal transformation of base layer decoded signal x(n) output from up-sampler 1604, calculates amplitude spectrum P(m), and outputs amplitude spectrum P(m) to estimated auditory masking calculator 1902 and determination section 1903. To be specific, FFT section 1901 calculates amplitude spectrum P(m) using Equation (33) below.

$$P(m) = \sqrt{Re^2(m) + Im^2(m)} \qquad (33)$$

Here, Re(m) and Im(m) indicate the real part and imaginary part of the Fourier coefficients of base layer decoded signal x(n), and m indicates frequency.
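The calculation of Equation (33) can be sketched as below. For brevity this sketch evaluates a direct discrete Fourier transform rather than a fast algorithm, and it returns only the non-redundant bins of a real signal; both choices, and the name `amplitude_spectrum`, are assumptions made for illustration.

```python
import cmath
import math


def amplitude_spectrum(x):
    """Return P(m) = sqrt(Re(m)^2 + Im(m)^2) for m = 0 .. len(x)//2."""
    n = len(x)
    spectrum = []
    for m in range(n // 2 + 1):
        # Fourier coefficient of x at frequency index m (direct DFT).
        coeff = sum(x[k] * cmath.exp(-2j * math.pi * m * k / n)
                    for k in range(n))
        spectrum.append(math.sqrt(coeff.real ** 2 + coeff.imag ** 2))
    return spectrum


# A unit impulse has a flat amplitude spectrum.
p = amplitude_spectrum([1.0, 0.0, 0.0, 0.0])
```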

Next, estimated auditory masking calculator 1902 calculates estimated auditory masking M′(m) using base layer decoded signal amplitude spectrum P(m), and outputs estimated auditory masking M′(m) to determination section 1903. Auditory masking is generally calculated based on the spectrum of an input signal, but in this implementation example, auditory masking is estimated using base layer decoded signal x(n) instead of the input signal. This is based on the idea that, since base layer decoded signal x(n) is determined so that there is little distortion with respect to the input signal, adequate approximation will be achieved and there will be no major problem if base layer decoded signal x(n) is used instead of the input signal.

Determination section 1903 then determines a frequency for which error spectrum coding by enhancement layer coder 1608 is applicable, using base layer decoded signal amplitude spectrum P(m) and estimated auditory masking M′(m) obtained by estimated auditory masking calculator 1902. Determination section 1903 regards base layer decoded signal amplitude spectrum P(m) as an approximation of the error spectrum, and outputs frequency m for which Equation (34) below holds true to enhancement layer coder 1608.

$$P(m) - M'(m) > 0 \qquad (34)$$

In Equation (34), term P(m) estimates the size of the error spectrum, and term M′(m) estimates auditory masking. Determination section 1903 then compares the value of the estimated error spectrum and estimated auditory masking, and if Equation (34) is satisfied (that is to say, if the value of the estimated error spectrum exceeds the value of the estimated auditory masking), the error spectrum of that frequency is assumed to be perceived as noise, and is made subject to coding by enhancement layer coder 1608.

Conversely, if the value of the estimated error spectrum is smaller than the size of the estimated auditory masking, determination section 1903 considers that the error spectrum of that frequency will not be perceived as noise due to the effects of masking, and determines the error spectrum of this frequency not to be subject to quantization.
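A minimal sketch of the decision made by determination section 1903 follows. The function name `select_frequencies` and the sample spectra are hypothetical; the comparison itself is Equation (34).

```python
def select_frequencies(p, m_est):
    """Return every frequency index m for which P(m) - M'(m) > 0.

    p     : amplitude spectrum of the base layer decoded signal
    m_est : estimated auditory masking M'(m), same length as p
    """
    return [m for m, (pm, mm) in enumerate(zip(p, m_est)) if pm - mm > 0]


# Hypothetical spectra: only bins 0 and 2 exceed the masking level.
coded_bins = select_frequencies([1.0, 0.2, 3.0, 0.5], [0.5, 0.5, 0.5, 0.5])
```

Since both inputs are derived from the base layer decoded signal, running the same function on the coding side and the decoding side yields the same list of frequencies without any side information.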

The operation of estimated auditory masking calculator 1902 will now be described. FIG. 21 is a drawing showing an example of the internal configuration of the auditory masking calculator of a sound coding apparatus of this embodiment. In FIG. 21, estimated auditory masking calculator 1902 mainly comprises a Bark spectrum calculator 2001, spread function convolution unit 2002, tonality calculator 2003, and auditory masking calculator 2004.

In FIG. 21, Bark spectrum calculator 2001 calculates Bark spectrum B(k) using Equation (35) below.

$$B(k) = \sum_{m=fl(k)}^{fh(k)} P^2(m) \qquad (35)$$

Here, P(m) indicates an amplitude spectrum, and is found from Equation (33) above, k corresponds to the Bark spectrum number, and fl(k) and fh(k) indicate the lowest frequency and highest frequency respectively of the k'th Bark spectrum. Bark spectrum B(k) indicates the spectral intensity in the case of band distribution at equal intervals on the Bark scale. If the Hertz scale is represented by f and the Bark scale by B, the relationship between the Hertz scale and Bark scale is expressed by Equation (36) below.

$$B = 13\tan^{-1}(0.76f) + 3.5\tan^{-1}\left(\frac{f}{7.5}\right) \qquad (36)$$
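The band summation of Equation (35) can be sketched as follows. The band edge table passed as `bands` is a hypothetical layout expressed in spectrum bin indices; the specification does not list the actual values of fl(k) and fh(k).

```python
def bark_spectrum(p, bands):
    """Equation (35): B(k) = sum of P(m)^2 for m = fl(k) .. fh(k).

    p     : amplitude spectrum P(m)
    bands : list of (fl, fh) bin index pairs, both ends inclusive
            (hypothetical layout, one pair per Bark band k)
    """
    return [sum(p[m] ** 2 for m in range(fl, fh + 1)) for fl, fh in bands]


# Two hypothetical bands covering bins 0-1 and 2-3.
b = bark_spectrum([1.0, 2.0, 3.0, 4.0], [(0, 1), (2, 3)])
```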

Spread function convolution unit 2002 convolves spread function SF(k) with Bark spectrum B(k) using Equation (37) below.

$$C(k) = B(k) * SF(k) \qquad (37)$$

Tonality calculator 2003 finds spectrum flatness SFM(k) of each Bark spectrum using Equation (38) below.

$$SFM(k) = \frac{\mu g(k)}{\mu a(k)} \qquad (38)$$

Here, μg(k) indicates the geometric mean of power spectra in the k'th Bark spectrum, and μa(k) indicates the arithmetic mean of power spectra in the k'th Bark spectrum. Tonality calculator 2003 then calculates tonality coefficient α(k) from decibel value SFMdB(k) of spectrum flatness SFM(k), using Equation (39) below.

$$\alpha(k) = \min\left(\frac{SFMdB(k)}{-60},\, 1.0\right) \qquad (39)$$
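Equations (38) and (39) can be sketched for a single Bark band as below. The conversion SFMdB = 10·log10(SFM) is an assumption (the text only says "decibel value"), as is the requirement that every power value be strictly positive so that the geometric mean is defined. The function name `tonality` is hypothetical.

```python
import math


def tonality(power_band):
    """Tonality coefficient alpha(k) for the power spectra of one band.

    power_band : strictly positive power values P(m)^2 within band k
    """
    count = len(power_band)
    geometric = math.exp(sum(math.log(v) for v in power_band) / count)
    arithmetic = sum(power_band) / count
    # Assumed decibel conversion of the flatness ratio of Equation (38).
    sfm_db = 10.0 * math.log10(geometric / arithmetic)
    # Equation (39).
    return min(sfm_db / -60.0, 1.0)


flat_band = tonality([2.0, 2.0, 2.0])               # noise-like band
peaky_band = tonality([1e-12, 1e-12, 1.0, 1e-12])   # tone-like band
```

A flat band gives a flatness ratio of about 1 and therefore a coefficient near 0, while a band dominated by one component drives the coefficient to the upper limit of 1.0.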

Using Equation (40) below, auditory masking calculator 2004 finds offset O(k) of each Bark scale from tonality coefficient α(k) calculated by tonality calculator 2003.

$$O(k) = \alpha(k)\cdot(14.5 - k) + (1.0 - \alpha(k))\cdot 5.5 \qquad (40)$$

Auditory masking calculator 2004 then uses Equation (41) below to calculate auditory masking T(k) by subtracting offset O(k) from C(k) found by spread function convolution unit 2002.

$$T(k) = \max\left(10^{\log_{10}(C(k)) - (O(k)/10)},\; T_q(k)\right) \qquad (41)$$

Here, Tq(k) indicates an absolute threshold value. The absolute threshold value represents the minimum value of auditory masking observed as a human auditory characteristic. Then auditory masking calculator 2004 converts auditory masking T(k) expressed on the Bark scale to the Hertz scale and finds estimated auditory masking M′(m), which it outputs to determination section 1903.
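The following sketch applies Equations (40) and (41) as written above to per-band inputs. Numbering the Bark bands from k = 0 is an assumption, as are the function name `masking_threshold` and the sample values; the conversion of T(k) back to the Hertz scale is not shown.

```python
import math


def masking_threshold(c, alpha, tq):
    """Return auditory masking T(k) for each Bark band k.

    c     : spread Bark spectrum C(k), strictly positive
    alpha : tonality coefficients alpha(k)
    tq    : absolute threshold values T_q(k)
    """
    thresholds = []
    for k, (ck, ak, tqk) in enumerate(zip(c, alpha, tq)):
        # Equation (40); k is assumed to start at 0.
        offset = ak * (14.5 - k) + (1.0 - ak) * 5.5
        # Equation (41): lower C(k) by the offset, floor at T_q(k).
        thresholds.append(max(10.0 ** (math.log10(ck) - offset / 10.0), tqk))
    return thresholds


# Hypothetical single band with alpha = 0, so the offset is 5.5.
t = masking_threshold([100.0], [0.0], [0.0])
```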

Enhancement layer coder 1608 performs MDCT coefficient coding using frequency m subject to quantization found in this way. FIG. 22 is a block diagram showing an example of the internal configuration of an enhancement layer coder of this embodiment. Enhancement layer coder 1608 in FIG. 22 mainly comprises an MDCT section 2101 and MDCT coefficient quantizer 2102.

MDCT section 2101 multiplies the input signal output from subtracter 1606 by an analysis window, then performs MDCT (Modified Discrete Cosine Transform) processing to obtain the MDCT coefficients. In MDCT processing, an orthogonal basis for analysis spans two successive frames. The analysis frames overlap by one-half, and the first half of the analysis frame is an odd function while the latter half of the analysis frame is an even function. A feature of MDCT processing is that frame boundary distortion does not occur, because waveforms are overlapped and added after the inverse transform. When MDCT is performed, the input signal is multiplied by a window function such as a sine window. If a sequence of MDCT coefficients is designated X(m), the MDCT coefficients are calculated in accordance with Equation (42) below.

$$X(m) = \sqrt{\frac{1}{N}} \sum_{n=0}^{2N-1} x(n)\cos\left\{\frac{(2n + 1 + N)\cdot(2m + 1)\pi}{4N}\right\} \qquad (42)$$
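A direct, unoptimized evaluation of Equation (42) can be sketched as below. The sketch assumes the 2N input samples have already been multiplied by the analysis window and that m runs from 0 to N−1; the function name `mdct` is hypothetical.

```python
import math


def mdct(x):
    """Evaluate Equation (42) for a windowed block of 2N samples.

    Returns the N coefficients X(0) .. X(N-1).
    """
    two_n = len(x)
    n_half = two_n // 2  # this is N in Equation (42)
    scale = math.sqrt(1.0 / n_half)
    coeffs = []
    for m in range(n_half):
        acc = 0.0
        for n in range(two_n):
            angle = (2 * n + 1 + n_half) * (2 * m + 1) * math.pi / (4 * n_half)
            acc += x[n] * math.cos(angle)
        coeffs.append(scale * acc)
    return coeffs


# Smallest possible block (N = 1): the only term is cos(pi) = -1.
tiny = mdct([0.0, 1.0])
```

A practical coder would use a fast algorithm, but the direct sum makes the correspondence with the equation explicit.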

MDCT coefficient quantizer 2102 quantizes the coefficients corresponding to the frequencies output from frequency determination section 1607. Then MDCT coefficient quantizer 2102 outputs the quantized MDCT coefficient coding information to multiplexer 1609.

Thus, according to a sound coding apparatus of this embodiment, because the frequencies for quantization in the enhancement layer are determined using a base layer decoded signal, it is unnecessary to transmit frequency information for quantization from the coding side to the decoding side, enabling high-quality coding to be performed at a low bit rate.

In the above embodiment, an auditory masking calculation method that uses FFT has been described, but it is also possible to calculate auditory masking using MDCT instead of FFT. FIG. 23 is a block diagram showing an example of the internal configuration of an auditory masking calculator of this embodiment. Parts in FIG. 23 identical to those in FIG. 20 are assigned the same reference numerals as in FIG. 20 and detailed descriptions thereof are omitted.

MDCT section 2201 approximates amplitude spectrum P(m) using the MDCT coefficients. To be specific, MDCT section 2201 approximates P(m) using Equation (43) below.

$$P(m) = \sqrt{R^2(m)} \qquad (43)$$

Here, R(m) indicates the MDCT coefficients found by performing MDCT processing on a signal supplied from up-sampler 1604.

Estimated auditory masking calculator 1902 calculates Bark spectrum B(k) from this approximated P(m). Thereafter, frequency information for quantization is calculated in accordance with the above-described method.

Thus, a sound coding apparatus of this embodiment can calculate auditory masking using MDCT.

The decoding side will now be described. FIG. 24 is a block diagram showing the configuration of a sound decoding apparatus according to Embodiment 9 of the present invention. Sound decoding apparatus 2300 in FIG. 24 mainly comprises a demultiplexer 2301, base layer decoder 2302, up-sampler 2303, frequency determination section 2304, enhancement layer decoder 2305, and adder 2306.

Demultiplexer 2301 separates code coded by sound coding apparatus 1600 into base layer first coding information and enhancement layer second coding information, outputs the first coding information to base layer decoder 2302, and outputs the second coding information to enhancement layer decoder 2305.

Base layer decoder 2302 decodes the first coding information and obtains a sampling rate FL decoded signal. Then base layer decoder 2302 outputs the decoded signal to up-sampler 2303. Up-sampler 2303 converts the sampling rate FL decoded signal to a sampling rate FH decoded signal, and outputs this signal to frequency determination section 2304 and adder 2306.

Using the up-sampled base layer decoded signal, frequency determination section 2304 determines error spectrum frequencies to be decoded in enhancement layer decoder 2305. This frequency determination section 2304 has the same kind of configuration as frequency determination section 1607 in FIG. 16.

Enhancement layer decoder 2305 decodes the second coding information and outputs the sampling rate FH decoded signal to adder 2306.

Adder 2306 adds the base layer decoded signal up-sampled by up-sampler 2303 and the enhancement layer decoded signal decoded by enhancement layer decoder 2305, and outputs the resulting signal.

Next, the operation of each block of a sound decoding apparatus according to this embodiment will be described in detail. FIG. 25 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus of this embodiment. FIG. 25 shows an example of the internal configuration of enhancement layer decoder 2305 in FIG. 24. Enhancement layer decoder 2305 in FIG. 25 mainly comprises an MDCT coefficient decoder 2401, IMDCT section 2402, and overlap adder 2403.

MDCT coefficient decoder 2401 decodes the quantized MDCT coefficients from the second coding information output from demultiplexer 2301, based on the frequencies output from frequency determination section 2304. To be specific, the decoded MDCT coefficients are positioned at the frequencies indicated by frequency determination section 2304, and zero is supplied for other frequencies.
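The positioning step of MDCT coefficient decoder 2401 can be sketched as follows. The sketch assumes the decoded values arrive in the same order as the frequency list from frequency determination section 2304; the function name `place_coefficients` and the sample values are hypothetical.

```python
def place_coefficients(decoded, freqs, size):
    """Build a full MDCT spectrum from the decoded coefficients.

    decoded : decoded coefficient values, one per selected frequency
    freqs   : frequency indices from the frequency determination section
    size    : total number of MDCT coefficients in the frame
    """
    spectrum = [0.0] * size  # zero is supplied for unselected frequencies
    for m, value in zip(freqs, decoded):
        spectrum[m] = value
    return spectrum


# Hypothetical frame of 6 coefficients with bins 1 and 4 selected.
frame = place_coefficients([0.7, -0.2], [1, 4], 6)
```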

IMDCT section 2402 executes inverse MDCT processing on the MDCT coefficients output from MDCT coefficient decoder 2401, generates a time domain signal, and outputs this signal to overlap adder 2403.

Overlap adder 2403 windows the time domain signal from IMDCT section 2402, performs an overlap-and-add operation, and outputs the decoded signal to adder 2306. To be specific, overlap adder 2403 multiplies the decoded signal by a window, overlaps and adds the time domain signals decoded in the previous frame and the current frame, and generates an output signal.
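The overlap-and-add operation can be sketched as below, assuming each IMDCT output block has length 2N and overlaps the previous block by N samples. The function name `overlap_add` and the rectangular window in the example are hypothetical and chosen only to keep the arithmetic visible.

```python
def overlap_add(prev_tail, block, window):
    """Window the current IMDCT block and add it to the previous tail.

    prev_tail : windowed second half of the previous block (N samples)
    block     : current IMDCT output (2N samples)
    window    : synthesis window (2N samples)
    Returns (output frame of N samples, tail to keep for the next call).
    """
    n = len(block) // 2
    windowed = [s * w for s, w in zip(block, window)]
    output = [prev_tail[i] + windowed[i] for i in range(n)]
    return output, windowed[n:]


out, tail = overlap_add([1.0, 1.0], [1.0, 2.0, 3.0, 4.0], [1.0] * 4)
```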

Thus, according to a sound decoding apparatus of this embodiment, by determining the frequencies for enhancement layer decoding using the base layer decoded signal, it is possible to determine those frequencies without any additional information, enabling high-quality coding to be performed at a low bit rate.

Embodiment 10

In this embodiment an example is described in which CELP is used in base layer coding. FIG. 26 is a block diagram showing an example of the internal configuration of a base layer coder of Embodiment 10 of the present invention. FIG. 26 shows an example of the internal configuration of base layer coder 1602 in FIG. 16. Base layer coder 1602 in FIG. 26 mainly comprises an LPC analyzer 2501, weighting section 2502, adaptive code book search unit 2503, adaptive gain quantizer 2504, target vector generator 2505, noise code book search unit 2506, noise gain quantizer 2507, and multiplexer 2508.

LPC analyzer 2501 calculates the LPC coefficients of a sampling rate FL input signal, converts the LPC coefficients to a parameter suitable for quantization such as the LSP coefficients, and performs quantization. LPC analyzer 2501 then outputs the coding information obtained by this quantization to multiplexer 2508.

Also, LPC analyzer 2501 calculates the quantized LSP coefficients from the coding information, converts these to the LPC coefficients, and outputs the quantized LPC coefficients to adaptive code book search unit 2503, adaptive gain quantizer 2504, noise code book search unit 2506, and noise gain quantizer 2507. LPC analyzer 2501 also outputs the original LPC coefficients to weighting section 2502, adaptive code book search unit 2503, adaptive gain quantizer 2504, noise code book search unit 2506, and noise gain quantizer 2507.

Weighting section 2502 performs weighting on the input signal output from down-sampler 1601 based on the LPC coefficients obtained by LPC analyzer 2501. The purpose of this is to perform spectrum shaping so that the quantization distortion spectrum is masked by the input signal spectral envelope.

The adaptive code book is then searched by adaptive code book search unit 2503 with the weighted input signal as the target signal. A signal in which a previously determined excitation signal is repeated on a pitch period basis is called an adaptive vector, and an adaptive code book is composed of adaptive vectors generated at pitch periods of a predetermined range.

If a weighted input signal is designated t(n), and a signal in which an impulse response of a weighted synthesis filter comprising the original LPC coefficients and the quantized LPC coefficients is convolved with the adaptive vector of pitch period i is designated pi(n), then adaptive code book search unit 2503 outputs pitch period i of the adaptive vector for which evaluation function D of Equation (44) below is minimized to multiplexer 2508 as coding information.

$$D = \sum_{n=0}^{N-1} t^2(n) - \frac{\left(\sum_{n=0}^{N-1} t(n)\,p_i(n)\right)^2}{\sum_{n=0}^{N-1} p_i^2(n)} \qquad (44)$$

Here, N indicates the vector length. As the first term of Equation (44) is independent of pitch period i, adaptive code book search unit 2503 actually calculates only the second term.
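Minimizing D in Equation (44) is equivalent to maximizing its second term, which the following sketch does by exhaustive search. The dictionary of filtered adaptive vectors keyed by pitch period, the function name `search_adaptive_codebook`, and the sample data are all hypothetical.

```python
def search_adaptive_codebook(target, filtered_vectors):
    """Return the pitch period i that maximizes (sum t*p_i)^2 / sum p_i^2.

    target           : weighted input signal t(n)
    filtered_vectors : {pitch period i: filtered adaptive vector p_i(n)}
    """
    best_period, best_score = None, -1.0
    for period, p in filtered_vectors.items():
        energy = sum(v * v for v in p)
        if energy == 0.0:
            continue  # skip all-zero candidates to avoid dividing by zero
        correlation = sum(t * v for t, v in zip(target, p))
        score = correlation * correlation / energy
        if score > best_score:
            best_period, best_score = period, score
    return best_period


# Hypothetical candidates: period 21 matches the target shape.
candidates = {20: [0.0, 1.0, 0.0, 1.0], 21: [2.0, 0.0, 2.0, 0.0]}
chosen = search_adaptive_codebook([1.0, 0.0, 1.0, 0.0], candidates)
```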

Adaptive gain quantizer 2504 performs quantization of the adaptive gain that is multiplied by the adaptive vector. Adaptive gain β is expressed by Equation (45) below. Adaptive gain quantizer 2504 performs scalar quantization of this adaptive gain β, and outputs the coding information obtained in quantization to multiplexer 2508.

$$\beta = \frac{\sum_{n=0}^{N-1} t(n)\,p_i(n)}{\sum_{n=0}^{N-1} p_i^2(n)} \qquad (45)$$

Target vector generator 2505 subtracts the effect of the adaptive vector from the input signal, and generates and outputs the target vector used by noise code book search unit 2506 and noise gain quantizer 2507. In target vector generator 2505, if pi(n) designates a signal in which a weighted synthesis filter impulse response is convolved with the adaptive vector when evaluation function D expressed by Equation (44) is minimized, and βq designates the quantized adaptive gain when adaptive gain β expressed by Equation (45) undergoes scalar quantization, then target vector t2(n) is expressed by Equation (46) below.

$$t_2(n) = t(n) - \beta q \cdot p_i(n) \qquad (46)$$
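Equations (45) and (46) can be sketched as below. The scalar quantizer itself is not specified in the text, so the quantized gain is passed in as a plain number; the function names and sample values are hypothetical.

```python
def adaptive_gain(target, p):
    """Equation (45): beta = sum t(n) p_i(n) / sum p_i(n)^2."""
    return sum(t * v for t, v in zip(target, p)) / sum(v * v for v in p)


def noise_search_target(target, p, beta_q):
    """Equation (46): t2(n) = t(n) - beta_q * p_i(n)."""
    return [t - beta_q * v for t, v in zip(target, p)]


t_vec = [2.0, 4.0]
p_vec = [1.0, 2.0]
beta = adaptive_gain(t_vec, p_vec)
# With an unquantized gain the adaptive vector explains t(n) completely.
t2 = noise_search_target(t_vec, p_vec, beta)
```

When the quantized gain differs from β, the remainder left in t2(n) is what the noise code book search then tries to match.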

Noise code book search unit 2506 carries out a noise code book search using the aforementioned target vector t2(n), the original LPC coefficients, and the quantized LPC coefficients. The noise code book can use random noise or a signal learned using a large amount of speech signal data, for example. Also, an algebraic code book can be used. An algebraic code book consists of several pulses. A feature of such an algebraic code book is that an optimal combination of pulse position and pulse code (polarity) can be determined by a small amount of computation.

If the target vector is designated t2(n), and a signal in which an impulse response of a weighted synthesis filter is convolved with the noise vector corresponding to code j is designated cj(n), then noise code book search unit 2506 outputs to multiplexer 2508 index j of the noise vector for which evaluation function D of Equation (47) below is minimized.

$$D = \sum_{n=0}^{N-1} t_2^2(n) - \frac{\left(\sum_{n=0}^{N-1} t_2(n)\,c_j(n)\right)^2}{\sum_{n=0}^{N-1} c_j^2(n)} \qquad (47)$$

Noise gain quantizer 2507 quantizes the noise gain that is multiplied by the noise vector. Noise gain quantizer 2507 calculates noise gain γ using Equation (48) below, performs scalar quantization of this noise gain γ, and outputs the coding information to multiplexer 2508.

$$\gamma = \frac{\sum_{n=0}^{N-1} t_2(n)\,c_j(n)}{\sum_{n=0}^{N-1} c_j^2(n)} \qquad (48)$$

Multiplexer 2508 multiplexes the LPC coefficient, adaptive vector, adaptive gain, noise vector, and noise gain coding information, and outputs the resultant information to local decoder 1603 and multiplexer 1609.

The decoding side will now be described. FIG. 27 is a block diagram showing an example of the internal configuration of a base layer decoder of this embodiment. FIG. 27 shows an example of base layer decoder 2302. Base layer decoder 2302 in FIG. 27 mainly comprises a demultiplexer 2601, excitation generator 2602, and synthesis filter 2603.

Demultiplexer 2601 separates first coding information from demultiplexer 2301 into LPC coefficient, adaptive vector, adaptive gain, noise vector, and noise gain coding information, and outputs the adaptive vector, adaptive gain, noise vector, and noise gain coding information to excitation generator 2602. Similarly, demultiplexer 2601 outputs the LPC coefficient coding information to synthesis filter 2603.

Excitation generator 2602 decodes adaptive vector, adaptive vector gain, noise vector, and noise vector gain coding information, and generates excitation vector ex(n) using Equation (49) below.

$$ex(n) = \beta_q \cdot q(n) + \gamma_q \cdot c(n) \qquad (49)$$

Here, q(n) indicates an adaptive vector, βq adaptive vector gain, c(n) a noise vector, and γq noise vector gain.

Synthesis filter 2603 performs LPC coefficient decoding from LPC coefficient coding information, and generates synthesized signal syn(n) from the decoded LPC coefficients using Equation (50) below.

$$syn(n) = ex(n) + \sum_{i=1}^{NP} \alpha_q(i) \cdot syn(n - i) \qquad (50)$$

Here, αq indicates the decoded LPC coefficients, and NP the order of the LPC coefficients. Synthesis filter 2603 then outputs decoded signal syn(n) decoded in this way to up-sampler 2303.
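The recursion of Equation (50) can be sketched as follows. Starting from zero filter memory (no samples before n = 0) is an assumption made to keep the example self-contained; the function name `synthesize` and the first-order coefficient are hypothetical.

```python
def synthesize(ex, alpha_q):
    """Equation (50): syn(n) = ex(n) + sum_{i=1..NP} alpha_q(i) * syn(n - i).

    ex      : excitation vector ex(n)
    alpha_q : decoded LPC coefficients alpha_q(1) .. alpha_q(NP)
    """
    syn = []
    for n, sample in enumerate(ex):
        acc = sample
        for i, coeff in enumerate(alpha_q, start=1):
            if n - i >= 0:  # assumed zero history before the first sample
                acc += coeff * syn[n - i]
        syn.append(acc)
    return syn


# Impulse response of a hypothetical first-order filter.
response = synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
```

In a frame-based decoder the last NP output samples of one frame would be carried over as the history for the next frame.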

Thus, according to a sound coding apparatus of this embodiment, by coding an input signal using CELP in the base layer on the transmitting side, and decoding this coded input signal using CELP on the receiving side, it is possible to implement a high-quality base layer at a low bit rate.

In order to suppress perception of quantization distortion, a coding apparatus of this embodiment can also employ a configuration with cascade connection of a post-filter after synthesis filter 2603. FIG. 28 is a block diagram showing an example of the internal configuration of a base layer decoder of this embodiment. Parts in FIG. 28 identical to those in FIG. 27 are assigned the same reference numerals as in FIG. 27 and detailed descriptions thereof are omitted.

Various kinds of configuration may be employed for post-filter 2701 to achieve suppression of perception of quantization distortion, one typical method being that of using a formant emphasis filter comprising the LPC coefficients obtained by decoding by demultiplexer 2601. Formant emphasis filter Hf(z) is expressed by Equation (51) below.

$$H_f(z) = \frac{A(z/\gamma_n)}{A(z/\gamma_d)}\left(1 - \mu z^{-1}\right) \qquad (51)$$

Here, A(z) indicates an analysis filter comprising the decoded LPC coefficients, and γn, γd, and μ indicate constants that determine filter characteristics.

Embodiment 11

FIG. 29 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus according to Embodiment 11 of the present invention. Parts in FIG. 29 identical to those in FIG. 20 are assigned the same reference numerals as in FIG. 20 and detailed descriptions thereof are omitted. Frequency determination section 1607 in FIG. 29 differs from that in FIG. 20 in being provided with an estimated error spectrum calculator 2801 and determination section 2802, and in estimating estimated error spectrum E′(m) from base layer decoded signal amplitude spectrum P(m), and determining a frequency of an error spectrum coded by enhancement layer coder 1608 using estimated error spectrum E′(m) and estimated auditory masking M′(m).

FFT section 1901 performs Fourier transform of base layer decoded signal x(n) output from up-sampler 1604, calculates amplitude spectrum P(m), and outputs amplitude spectrum P(m) to estimated auditory masking calculator 1902 and estimated error spectrum calculator 2801.

Estimated error spectrum calculator 2801 calculates estimated error spectrum E′(m) from base layer decoded signal amplitude spectrum P(m) calculated by FFT section 1901, and outputs estimated error spectrum E′(m) to determination section 2802. Estimated error spectrum E′(m) is calculated by executing processing that approximates base layer decoded signal amplitude spectrum P(m) to flatness. To be specific, estimated error spectrum calculator 2801 calculates estimated error spectrum E′(m) using Equation (52) below.

$$E'(m) = a \cdot P(m)^{\gamma} \qquad (52)$$

Here, a and γ are constants of 0 or above and less than 1.

Using estimated error spectrum E′(m) obtained by estimated errorspectrum calculator 2801 and estimated auditory masking M′(m) obtainedby estimated auditory masking calculator 1902, determination section2802 determines frequencies for error spectrum coding by enhancementlayer coder 1608.

Next, an estimated error spectrum calculated by estimated error spectrumcalculator 2801 of this embodiment will be described. FIG. 30 is adrawing showing an example of a residual error spectrum calculated by anestimated error spectrum calculator of this embodiment.

As shown in FIG. 30, the spectrum shape of error spectrum E(m) is smoother than that of base layer decoded signal amplitude spectrum P(m), and its total band power is smaller. Therefore, the precision of error spectrum estimation can be improved by flattening amplitude spectrum P(m) by raising it to the power of γ (0<γ<1), and reducing total band power by multiplying by a (0<a<1).

On the decoding side also, the internal configuration of frequency determination section 2304 of sound decoding apparatus 2300 is the same as that of coding-side frequency determination section 1607 in FIG. 29.

Thus, according to a sound coding apparatus of this embodiment, by smoothing a residual error spectrum estimated from a base layer decoded signal spectrum, the estimated error spectrum can be approximated to the residual error spectrum, and an error spectrum can be coded efficiently in the enhancement layer.

In this embodiment a case has been described in which FFT is used, but a configuration is also possible in which MDCT or another transform is used instead of FFT, as in above-described Embodiment 9.

Embodiment 12

FIG. 31 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus according to Embodiment 12 of the present invention. Parts in FIG. 31 identical to those in FIG. 20 are assigned the same reference numerals as in FIG. 20 and detailed descriptions thereof are omitted. Frequency determination section 1607 in FIG. 31 differs from that in FIG. 20 in being provided with an estimated auditory masking correction section 3001 and determination section 3002, and in that frequency determination section 1607, after calculating estimated auditory masking M′(m) by means of estimated auditory masking calculator 1902 from base layer decoded signal amplitude spectrum P(m), applies correction to this estimated auditory masking M′(m) based on decoded parameter information from local decoder 1603.

FFT section 1901 performs Fourier transform of base layer decoded signal x(n) output from up-sampler 1604, calculates amplitude spectrum P(m), and outputs amplitude spectrum P(m) to estimated auditory masking calculator 1902 and determination section 3002. Estimated auditory masking calculator 1902 calculates estimated auditory masking M′(m) using base layer decoded signal amplitude spectrum P(m), and outputs estimated auditory masking M′(m) to estimated auditory masking correction section 3001.

Using base layer decoded parameter information input from local decoder 1603, estimated auditory masking correction section 3001 applies correction to estimated auditory masking M′(m) obtained by estimated auditory masking calculator 1902.

It is here assumed that a first order PARCOR coefficient calculated from the decoded LPC coefficients is supplied as base layer coding information. Generally, the LPC coefficients and PARCOR coefficients represent an input signal spectral envelope. Due to the properties of the PARCOR coefficients, as the order of the PARCOR coefficients is lowered, the shape of a spectral envelope is simplified, and when the order of the PARCOR coefficients is 1, the degree of tilt of a spectrum is indicated.

On the other hand, in the spectral characteristics of an audio or speech input signal, there are cases where power is biased toward the lower region as opposed to the higher region (as with vowels, for example), and cases where the converse is true (as with consonants, for example). A base layer decoded signal is susceptible to the influence of such input signal spectral characteristics, and there is a tendency for spectrum power bias to be emphasized more than necessary.

Thus, in a sound coding apparatus of this embodiment, the precision of estimated masking M′(m) can be improved by correcting excessively emphasized spectral bias in estimated auditory masking correction section 3001 using the aforementioned first order PARCOR coefficient.

Estimated auditory masking correction section 3001 calculates correction filter H_k(z) from first order PARCOR coefficient k(1) output from base layer coder 1602, using Equation (53) below.

$$H_k(z) = 1 - \beta \cdot k(1) \cdot z^{-1} \qquad (53)$$

Here, β indicates a positive constant less than 1. Next, estimated auditory masking correction section 3001 calculates amplitude characteristic K(m) of correction filter H_k(z) using Equation (54) below.

$$K(m) = \left| 1 - \beta \cdot k(1) \cdot e^{-j\frac{2\pi m}{M}} \right| \qquad (54)$$

Then estimated auditory masking correction section 3001 calculates corrected estimated auditory masking M″(m) from correction filter amplitude characteristic K(m), using Equation (55) below.

$$M''(m) = K(m) \cdot M'(m) \qquad (55)$$
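Equations (53) to (55) can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the denominator M in Equation (54) is taken to be the number of spectral points (here `num_points`), and the values of k(1), β, and M′(m) are hypothetical. The function names are not from the source.

```python
import cmath
import math


def correction_amplitude(k1, beta, num_points):
    """K(m) = |1 - beta * k(1) * exp(-j*2*pi*m/M)| for m = 0 .. M-1."""
    return [
        abs(1.0 - beta * k1 * cmath.exp(-2j * math.pi * m / num_points))
        for m in range(num_points)
    ]


def correct_masking(M_est, k1, beta=0.5):
    """M''(m) = K(m) * M'(m), Equation (55)."""
    K = correction_amplitude(k1, beta, len(M_est))
    return [k * mask for k, mask in zip(K, M_est)]


# Hypothetical inputs: flat estimated masking and k(1) = 0.8.
M_flat = [1.0] * 8
M_corrected = correct_masking(M_flat, k1=0.8, beta=0.5)
print([round(v, 3) for v in M_corrected])
```

For these sample values K(0) = 0.6 and K(4) = 1.4, so the correction tilts the masking curve across the band; the direction of the tilt follows the sign of k(1).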

Estimated auditory masking correction section 3001 then outputs corrected estimated auditory masking M″(m) to determination section 3002 instead of estimated auditory masking M′(m).

Using base layer decoded signal amplitude spectrum P(m) and corrected auditory masking M″(m) output from estimated auditory masking correction section 3001, determination section 3002 determines frequencies for error spectrum coding by enhancement layer coder 1608.

Thus, according to a sound coding apparatus of this embodiment, by calculating auditory masking from an input signal spectrum using masking effect characteristics, and performing quantization so that quantization distortion does not exceed the masking value in enhancement layer coding, it is possible to reduce the number of MDCT coefficients subject to quantization without a degradation of quality, and to perform high-quality coding at a low bit rate.

Thus, according to a sound coding apparatus of this embodiment, by applying correction based on base layer coder decoded parameter information to estimated auditory masking, it is possible to improve the precision of estimated auditory masking, and to perform efficient error spectrum coding in the enhancement layer.

On the decoding side also, the internal configuration of frequency determination section 2304 of sound decoding apparatus 2300 is the same as that of coding-side frequency determination section 1607 in FIG. 31.

It is also possible for frequency determination section 1607 of this embodiment to employ a configuration combining this embodiment and Embodiment 11. FIG. 32 is a block diagram showing an example of the internal configuration of the frequency determination section of a sound coding apparatus of this embodiment. Parts in FIG. 32 identical to those in FIG. 20 are assigned the same reference numerals as in FIG. 20 and detailed descriptions thereof are omitted.

FFT section 1901 performs Fourier transform of base layer decoded signal x(n) output from up-sampler 1604, calculates amplitude spectrum P(m), and outputs amplitude spectrum P(m) to estimated auditory masking calculator 1902 and estimated error spectrum calculator 2801.

Estimated auditory masking calculator 1902 calculates estimated auditory masking M′(m) using base layer decoded signal amplitude spectrum P(m), and outputs estimated auditory masking M′(m) to estimated auditory masking correction section 3001.

Using base layer decoded parameter information input from local decoder 1603, estimated auditory masking correction section 3001 applies correction to estimated auditory masking M′(m) obtained by estimated auditory masking calculator 1902.

Estimated error spectrum calculator 2801 calculates estimated error spectrum E′(m) from base layer decoded signal amplitude spectrum P(m) calculated by FFT section 1901, and outputs estimated error spectrum E′(m) to determination section 3101.

Using estimated error spectrum E′(m) estimated by estimated error spectrum calculator 2801 and corrected auditory masking M″(m) output from estimated auditory masking correction section 3001, determination section 3101 determines the frequencies subject to error spectrum coding by enhancement layer coder 1608.

In this embodiment a case has been described in which FFT is used, but a configuration is also possible in which MDCT or another transform technique is used instead of FFT, as in above-described Embodiment 9.

Embodiment 13

FIG. 33 is a block diagram showing an example of the internal configuration of the enhancement layer coder of a sound coding apparatus according to Embodiment 13 of the present invention. Parts in FIG. 33 identical to those in FIG. 22 are assigned the same reference numerals as in FIG. 22 and detailed descriptions thereof are omitted. The enhancement layer coder in FIG. 33 differs from the enhancement layer coder in FIG. 22 in being provided with an ordering section 3201 and MDCT coefficient quantizer 3202, and in that the frequencies supplied from frequency determination section 1607 are weighted frequency by frequency in accordance with the size of estimated distortion value D(m).

In FIG. 33, MDCT section 2101 multiplies the input signal output from subtracter 1606 by an analysis window, then performs MDCT (Modified Discrete Cosine Transform) processing to obtain MDCT coefficients, and outputs the MDCT coefficients to MDCT coefficient quantizer 3202.

Ordering section 3201 receives frequency information obtained by frequency determination section 1607, and calculates the amount by which estimated error spectrum E′(m) of each frequency exceeds estimated auditory masking M′(m) (hereinafter referred to as the estimated distortion value), D(m). This estimated distortion value D(m) is defined by Equation (56) below.

$$D(m) = E'(m) - M'(m) \qquad (56)$$

Here, ordering section 3201 calculates only estimated distortion values D(m) that satisfy Equation (57) below.

$$E'(m) - M'(m) > 0 \qquad (57)$$

Then ordering section 3201 performs ordering in high-to-low estimated distortion value D(m) order, and outputs the corresponding frequency information to MDCT coefficient quantizer 3202. MDCT coefficient quantizer 3202 performs quantization, allocating bits proportionally to error spectra E(m) positioned at frequencies in high-to-low distortion value D(m) order based on the estimated distortion value D(m).

As an example, a case will here be described in which frequencies sent from the frequency determination section and estimated distortion values are as shown in FIG. 34. FIG. 34 is a drawing showing an example of ranking of estimated distortion values by an ordering section of this embodiment.

Ordering section 3201 rearranges frequencies in high-to-low estimated distortion value D(m) order based on the information in FIG. 34. In this example, the frequency m order obtained as a result of processing by ordering section 3201 is: 7, 8, 4, 9, 1, 11, 3, 12. Ordering section 3201 outputs this ordering information to MDCT coefficient quantizer 3202.
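The ordering step built on Equations (56) and (57) can be sketched as follows. The values in FIG. 34 are not reproduced in the text, so the numbers below are hypothetical and were chosen only so that the result matches the order stated above; the function name `order_by_distortion` is likewise an assumption.

```python
def order_by_distortion(E_est, M_est):
    """Return frequencies sorted by D(m) = E'(m) - M'(m), largest first.

    E_est and M_est are dicts keyed by frequency index m. Frequencies
    that do not satisfy E'(m) - M'(m) > 0 are dropped (Equation (57)).
    """
    D = {m: E_est[m] - M_est[m] for m in E_est}
    D = {m: d for m, d in D.items() if d > 0}
    return sorted(D, key=lambda m: D[m], reverse=True)


# Hypothetical spectra: masking is flat at 1.0, frequency 5 lies below it.
M_fig = {m: 1.0 for m in (1, 3, 4, 5, 7, 8, 9, 11, 12)}
E_fig = {1: 5.0, 3: 3.0, 4: 7.0, 5: 0.5, 7: 9.0, 8: 8.0, 9: 6.0, 11: 4.0, 12: 2.0}
order = order_by_distortion(E_fig, M_fig)
print(order)
```

The decoding side can run the same function on its own E′(m) and M′(m), which is why the ordering itself does not need to be transmitted.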

Within error spectrum E(m) given by MDCT section 2101, MDCT coefficient quantizer 3202 quantizes E(7), E(8), E(4), E(9), E(1), E(11), E(3), E(12), based on the ordering information given by ordering section 3201.

At this time, many bits are allocated for error spectrum quantization at the start of the order, and progressively fewer bits toward the end of the order. That is to say, the larger the estimated distortion value D(m) of a frequency, the more bits are allocated for error spectrum quantization, and the smaller the estimated distortion value D(m) of a frequency, the fewer bits are allocated.

For example, bit allocation may be executed as follows: 8 bits for E(7), 7 bits for E(8) and E(4), 6 bits for E(9) and E(1), and 5 bits for E(11), E(3), and E(12). Performing adaptive bit allocation according to estimated distortion value D(m) in this way improves quantization efficiency.
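One way to realize an allocation that falls as D(m) falls is to split a bit budget in proportion to D(m). The sketch below is an assumption for illustration, not the allocation rule of the embodiment: the largest-remainder rounding, the function name `allocate_bits`, the budget, and the D(m) values are all hypothetical.

```python
import math


def allocate_bits(ordered_D, budget):
    """Split `budget` bits in proportion to D(m), given in high-to-low order."""
    total = sum(ordered_D)
    if total <= 0:
        raise ValueError("at least one positive D(m) is required")
    shares = [budget * d / total for d in ordered_D]
    bits = [math.floor(s) for s in shares]
    leftover = budget - sum(bits)
    # Hand the remaining bits to the largest fractional parts; the sort is
    # stable, so an earlier (higher D) position wins a tie.
    by_fraction = sorted(
        range(len(shares)), key=lambda i: shares[i] - bits[i], reverse=True
    )
    for i in by_fraction[:leftover]:
        bits[i] += 1
    return bits


# Hypothetical D(m) values, already ordered high to low.
D_ordered = [8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
bits = allocate_bits(D_ordered, budget=18)
print(bits)
```

With these inputs the allocation never increases along the order, which matches the principle that frequencies with larger estimated distortion receive more bits.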

When vector quantization is applied, enhancement layer coder 1608 configures vectors in order from the error spectrum located at the start of the order, and performs vector quantization for the respective vectors. At this time, vector configuration and quantization bit allocation are performed so that bit allocation is greater for an error spectrum located at the start of the order, and smaller for an error spectrum located at the end of the order. In the example in FIG. 34, three vectors (two-dimensional, two-dimensional, and four-dimensional) are configured, with V1=(E(7), E(8)), V2=(E(4), E(9)), and V3=(E(1), E(11), E(3), E(12)), and the bit allocations are 10 bits for V1, 8 bits for V2, and 8 bits for V3.
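The vector configuration in this example can be sketched as a simple grouping of the ordered frequencies. The dimensions (2, 2, 4) and bit counts (10, 8, 8) are taken from the text; the function name `build_vectors` and the tuple representation are assumptions made for illustration.

```python
def build_vectors(order, dims, bits):
    """Group ordered frequencies into vectors and attach a bit count to each."""
    if sum(dims) != len(order) or len(dims) != len(bits):
        raise ValueError("dims must cover the order, one bit count per vector")
    vectors = []
    pos = 0
    for dim, nbits in zip(dims, bits):
        # Each vector takes the next `dim` frequencies from the ordering.
        vectors.append((tuple(order[pos:pos + dim]), nbits))
        pos += dim
    return vectors


order_example = [7, 8, 4, 9, 1, 11, 3, 12]
vectors = build_vectors(order_example, dims=(2, 2, 4), bits=(10, 8, 8))
for freqs, nbits in vectors:
    print(freqs, nbits, nbits / len(freqs))
```

The printed bits-per-element figures (5, 4, and 2) show that the allocation per coefficient shrinks toward the end of the order even though V2 and V3 receive the same total.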

Thus, according to a sound coding apparatus of this embodiment, an improvement in quantization efficiency can be achieved by, in enhancement layer coding, performing coding with a large amount of information allocated to frequencies for which the amount by which the estimated error spectrum exceeds estimated auditory masking is large.

The decoding side will now be described. FIG. 35 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 13 of the present invention. Parts in FIG. 35 identical to those in FIG. 25 are assigned the same reference numerals as in FIG. 25 and detailed descriptions thereof are omitted. Enhancement layer decoder 2305 in FIG. 35 differs from that in FIG. 25 in being provided with an ordering section 3401 and MDCT coefficient decoder 3402, and in that frequencies supplied from frequency determination section 2304 are ordered in accordance with the size of estimated distortion value D(m).

Ordering section 3401 calculates estimated distortion value D(m) using Equation (56) above. Ordering section 3401 has the same configuration as above-described ordering section 3201. By means of this configuration, it is possible to decode coding information of the above-described sound coding method that enables adaptive bit allocation to be performed and an improvement in quantization efficiency to be achieved.

MDCT coefficient decoder 3402 decodes second coding information output from demultiplexer 2301 using frequency information ordered in accordance with the size of estimated distortion value D(m). To be specific, MDCT coefficient decoder 3402 positions the decoded MDCT coefficients corresponding to the frequencies supplied from frequency determination section 2304, and supplies zero for other frequencies. IMDCT section 2402 then executes inverse MDCT processing on the MDCT coefficients obtained from MDCT coefficient decoder 3402, and generates a time domain signal.
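The positioning step on the decoding side can be sketched as below. This is an illustration only: it assumes the decoded coefficients arrive in the same order as the frequency list, and the function name `place_coefficients`, the frame length, and the sample values are hypothetical.

```python
def place_coefficients(decoded, freqs, num_coeffs):
    """Put each decoded MDCT coefficient at its frequency; zero elsewhere."""
    if len(decoded) != len(freqs):
        raise ValueError("one decoded coefficient per frequency is expected")
    spectrum = [0.0] * num_coeffs
    for m, value in zip(freqs, decoded):
        spectrum[m] = value
    return spectrum


# Hypothetical decoded values for the first three frequencies of the order.
spectrum = place_coefficients([0.9, -0.4, 0.2], freqs=[7, 8, 4], num_coeffs=10)
print(spectrum)
```

The resulting full-length coefficient array is what would be handed to the inverse MDCT stage.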

Overlap adder 2403 multiplies the aforementioned signal by a window function for combining, overlaps and adds the time domain signal decoded in the previous frame and that of the current frame, and generates an output signal. Overlap adder 2403 outputs this output signal to adder 2306.

Thus, according to a sound decoding apparatus of this embodiment, an improvement in quantization efficiency can be achieved by, in enhancement layer coding, performing vector quantization with adaptive bit allocation performed according to the amount by which an estimated error spectrum exceeds estimated auditory masking.

Embodiment 14

FIG. 36 is a block diagram showing an example of the internal configuration of the enhancement layer coder of a sound coding apparatus according to Embodiment 14 of the present invention. Parts in FIG. 36 identical to those in FIG. 22 are assigned the same reference numerals as in FIG. 22 and detailed descriptions thereof are omitted. The enhancement layer coder in FIG. 36 differs from the enhancement layer coder in FIG. 22 in being provided with a fixed band specification section 3501 and MDCT coefficient quantizer 3502, and in that the MDCT coefficients included in a band specified beforehand are quantized together with those at the frequencies obtained from frequency determination section 1607.

In FIG. 36, a band important in terms of auditory perception is set beforehand in fixed band specification section 3501. It is here assumed that “m=15, 16” is set for frequencies included in the set band.

MDCT coefficient quantizer 3502 categorizes the MDCT coefficients input from MDCT section 2101 into coefficients to be quantized and coefficients not to be quantized using the auditory masking output from frequency determination section 1607, and encodes the coefficients to be quantized and also the coefficients in the band set by fixed band specification section 3501.

Assuming the relevant frequencies to be as shown in FIG. 34, error spectra E(1), E(3), E(4), E(7), E(8), E(9), E(11), E(12), and error spectra E(15), E(16) of frequencies specified by fixed band specification section 3501 are quantized by MDCT coefficient quantizer 3502.
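The set of frequencies quantized in this example is the union of the frequencies chosen by the frequency determination section and the fixed band. A minimal sketch, with the hypothetical function name `frequencies_to_quantize`:

```python
def frequencies_to_quantize(selected, fixed_band):
    """Merge masking-selected frequencies with the always-coded fixed band."""
    # A set union avoids coding a frequency twice when it appears in both.
    return sorted(set(selected) | set(fixed_band))


selected_fig34 = [1, 3, 4, 7, 8, 9, 11, 12]  # frequencies listed in the text
fixed_band = [15, 16]                         # "m=15, 16" from the text
targets = frequencies_to_quantize(selected_fig34, fixed_band)
print(targets)
```

Using a union means that a fixed-band frequency that also happens to be selected by masking is still coded only once.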

Thus, according to a sound coding apparatus of this embodiment, by forcibly quantizing a band that is unlikely to be selected as an object of quantization but that is important from an auditory standpoint, even if a frequency that should really be selected as an object of coding is not selected, an error spectrum located at a frequency included in a band that is important from an auditory standpoint is quantized without fail, enabling quality to be improved.

The decoding side will now be described. FIG. 37 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 14 of the present invention. Parts in FIG. 37 identical to those in FIG. 25 are assigned the same reference numerals as in FIG. 25 and detailed descriptions thereof are omitted. The enhancement layer decoder in FIG. 37 differs from the enhancement layer decoder in FIG. 25 in being provided with a fixed band specification section 3601 and MDCT coefficient decoder 3602, and in that the MDCT coefficients included in a band specified beforehand are decoded together with those at the frequencies obtained from frequency determination section 2304.

In FIG. 37, a band important in terms of auditory perception is set beforehand in fixed band specification section 3601.

MDCT coefficient decoder 3602 decodes the quantized MDCT coefficients from second coding information output from demultiplexer 2301 based on the error spectrum frequencies subject to decoding output from frequency determination section 2304. To be specific, MDCT coefficient decoder 3602 positions decoded MDCT coefficients corresponding to frequencies indicated by frequency determination section 2304 and fixed band specification section 3601, and supplies zero for other frequencies.

IMDCT section 2402 executes inverse MDCT processing on the MDCT coefficients output from MDCT coefficient decoder 3602, generates a time domain signal, and outputs this time domain signal to overlap adder 2403.

Thus, according to a sound decoding apparatus of this embodiment, by decoding the MDCT coefficients included in a band specified beforehand, it is possible to decode a signal in which a band that is unlikely to be selected as an object of quantization but that is important from an auditory standpoint has been forcibly quantized, and even if the frequencies that should really be selected as an object of coding on the coding side are not selected, an error spectrum located at the frequencies included in a band that is important from an auditory standpoint is quantized without fail, enabling quality to be improved.

It is also possible for an enhancement layer coder and enhancement layer decoder of this embodiment to employ a configuration combining this embodiment and Embodiment 13. FIG. 38 is a block diagram showing an example of the internal configuration of the enhancement layer coder of a sound coding apparatus of this embodiment. Parts in FIG. 38 identical to those in FIG. 22 are assigned the same reference numerals as in FIG. 22 and detailed descriptions thereof are omitted.

In FIG. 38, MDCT section 2101 multiplies the input signal output from subtracter 1606 by an analysis window, then performs MDCT (Modified Discrete Cosine Transform) processing to obtain the MDCT coefficients, and outputs the MDCT coefficients to MDCT coefficient quantizer 3701.

Ordering section 3201 receives frequency information obtained by frequency determination section 1607, and calculates the amount by which estimated error spectrum E′(m) of each frequency exceeds estimated auditory masking M′(m) (hereinafter referred to as the estimated distortion value), D(m).

A band important in terms of auditory perception is set beforehand in fixed band specification section 3501.

MDCT coefficient quantizer 3701 performs quantization, allocating bits proportionally to error spectra E(m) positioned at frequencies in high-to-low distortion value D(m) order based on frequency information ordered according to estimated distortion value D(m). MDCT coefficient quantizer 3701 also encodes the coefficients in a band set by fixed band specification section 3501.

The decoding side will now be described. FIG. 39 is a block diagram showing an example of the internal configuration of the enhancement layer decoder of a sound decoding apparatus according to Embodiment 14 of the present invention. Parts in FIG. 39 identical to those in FIG. 25 are assigned the same reference numerals as in FIG. 25 and detailed descriptions thereof are omitted.

In FIG. 39, ordering section 3401 receives frequency information obtained by frequency determination section 2304, and calculates the amount by which estimated error spectrum E′(m) of each frequency exceeds estimated auditory masking M′(m) (hereinafter referred to as the estimated distortion value), D(m).

Then ordering section 3401 performs ordering in high-to-low estimated distortion value D(m) order, and outputs the corresponding frequency information to MDCT coefficient decoder 3801. A band important in terms of auditory perception is set beforehand in fixed band specification section 3601.

MDCT coefficient decoder 3801 decodes the quantized MDCT coefficients from second coding information output from demultiplexer 2301 based on the error spectrum frequencies subject to decoding output from ordering section 3401. To be specific, MDCT coefficient decoder 3801 positions decoded MDCT coefficients corresponding to frequencies indicated by ordering section 3401 and fixed band specification section 3601, and supplies zero for other frequencies.

IMDCT section 2402 executes inverse MDCT processing on the MDCT coefficients output from MDCT coefficient decoder 3801, generates a time domain signal, and outputs this time domain signal to overlap adder 2403.

Embodiment 15

Embodiment 15 of the present invention will now be described with reference to the attached drawings. FIG. 40 is a block diagram showing the configuration of a communication apparatus according to Embodiment 15 of the present invention. A feature of this embodiment is that signal processing apparatus 3903 in FIG. 40 is configured as one of the sound coding apparatuses shown in above-described Embodiment 1 through Embodiment 14.

As shown in FIG. 40, a communication apparatus 3900 according to Embodiment 15 of the present invention comprises an input apparatus 3901, A/D conversion apparatus 3902, and signal processing apparatus 3903 connected to a network 3904.

A/D conversion apparatus 3902 is connected to an output terminal of input apparatus 3901. An input terminal of signal processing apparatus 3903 is connected to an output terminal of A/D conversion apparatus 3902. An output terminal of signal processing apparatus 3903 is connected to network 3904.

Input apparatus 3901 converts a sound wave audible to the human ear to an analog signal, which is an electrical signal, and supplies this analog signal to A/D conversion apparatus 3902. A/D conversion apparatus 3902 converts the analog signal to a digital signal, and supplies this digital signal to signal processing apparatus 3903. Signal processing apparatus 3903 encodes the input digital signal and generates code, and outputs this code to network 3904.

Thus, according to a communication apparatus of this embodiment of the present invention, effects such as shown in above-described Embodiments 1 through 14 can be obtained in communications, and it is possible to provide a sound coding apparatus that encodes an acoustic signal efficiently with a small number of bits.

Embodiment 16

Embodiment 16 of the present invention will now be described with reference to the attached drawings. FIG. 41 is a block diagram showing the configuration of a communication apparatus according to Embodiment 16 of the present invention. A feature of this embodiment is that signal processing apparatus 4003 in FIG. 41 is configured as one of the sound decoding apparatuses shown in above-described Embodiment 1 through Embodiment 14.

As shown in FIG. 41, a communication apparatus 4000 according to Embodiment 16 of the present invention comprises a receiving apparatus 4002 connected to a network 4001, a signal processing apparatus 4003, a D/A conversion apparatus 4004, and an output apparatus 4005.

Receiving apparatus 4002 is connected to network 4001. An input terminal of signal processing apparatus 4003 is connected to an output terminal of receiving apparatus 4002. An input terminal of D/A conversion apparatus 4004 is connected to an output terminal of signal processing apparatus 4003. An input terminal of output apparatus 4005 is connected to an output terminal of D/A conversion apparatus 4004.

Receiving apparatus 4002 receives a digital coded acoustic signal from network 4001, generates a digital received acoustic signal, and supplies this received acoustic signal to signal processing apparatus 4003. Signal processing apparatus 4003 receives the received acoustic signal from receiving apparatus 4002, performs decoding processing on this received acoustic signal and generates a digital decoded acoustic signal, and supplies this digital decoded acoustic signal to D/A conversion apparatus 4004. D/A conversion apparatus 4004 converts the digital decoded acoustic signal from signal processing apparatus 4003 and generates an analog decoded acoustic signal, and supplies this analog decoded acoustic signal to output apparatus 4005. Output apparatus 4005 converts the analog decoded acoustic signal, which is an electrical signal, to air vibrations, and outputs these air vibrations so as to be audible to the human ear as a sound wave.

Thus, according to a communication apparatus of this embodiment, effects such as shown in above-described Embodiments 1 through 14 can be obtained in communications, and it is possible to decode an acoustic signal coded efficiently with a small number of bits, enabling a good acoustic signal to be output.

Embodiment 17

Embodiment 17 of the present invention will now be described with reference to the attached drawings. FIG. 42 is a block diagram showing the configuration of a communication apparatus according to Embodiment 17 of the present invention. A feature of this embodiment is that signal processing apparatus 4103 in FIG. 42 is configured as one of the sound coding apparatuses shown in above-described Embodiment 1 through Embodiment 14.

As shown in FIG. 42, a communication apparatus 4100 according to Embodiment 17 of the present invention comprises an input apparatus 4101, A/D conversion apparatus 4102, signal processing apparatus 4103, RF modulation apparatus 4104, and antenna 4105.

Input apparatus 4101 converts a sound wave audible to the human ear to an analog signal, which is an electrical signal, and supplies this analog signal to A/D conversion apparatus 4102. A/D conversion apparatus 4102 converts the analog signal to a digital signal, and supplies this digital signal to signal processing apparatus 4103. Signal processing apparatus 4103 encodes the input digital signal and generates a coded acoustic signal, and supplies this coded acoustic signal to RF modulation apparatus 4104. RF modulation apparatus 4104 modulates the coded acoustic signal and generates a modulated coded acoustic signal, and supplies this modulated coded acoustic signal to antenna 4105. Antenna 4105 transmits the modulated coded acoustic signal as a radio wave.

Thus, according to a communication apparatus of this embodiment, effects such as shown in above-described Embodiments 1 through 14 can be obtained in radio communications, and it is possible to code an acoustic signal efficiently with a small number of bits.

The present invention can be applied to a transmitting apparatus, transmit coding apparatus, or acoustic signal coding apparatus that uses audio signals. The present invention can also be applied to a mobile station apparatus or base station apparatus.

Embodiment 18

Embodiment 18 of the present invention will now be described with reference to the attached drawings. FIG. 43 is a block diagram showing the configuration of a communication apparatus according to Embodiment 18 of the present invention. A feature of this embodiment is that signal processing apparatus 4203 in FIG. 43 is configured as one of the sound decoding apparatuses shown in above-described Embodiment 1 through Embodiment 14.

As shown in FIG. 43, a communication apparatus 4200 according to Embodiment 18 of the present invention comprises an antenna 4201, RF demodulation apparatus 4202, signal processing apparatus 4203, D/A conversion apparatus 4204, and output apparatus 4205.

Antenna 4201 receives a digital coded acoustic signal as a radio wave, generates a digital received coded acoustic signal, which is an electrical signal, and supplies this digital received coded acoustic signal to RF demodulation apparatus 4202. RF demodulation apparatus 4202 demodulates the received coded acoustic signal from antenna 4201 and generates a demodulated coded acoustic signal, and supplies this demodulated coded acoustic signal to signal processing apparatus 4203.

Signal processing apparatus 4203 receives the digital demodulated coded acoustic signal from RF demodulation apparatus 4202, performs decoding processing and generates a digital decoded acoustic signal, and supplies this digital decoded acoustic signal to D/A conversion apparatus 4204. D/A conversion apparatus 4204 converts the digital decoded acoustic signal from signal processing apparatus 4203 and generates an analog decoded acoustic signal, and supplies this analog decoded acoustic signal to output apparatus 4205. Output apparatus 4205 converts the analog decoded acoustic signal, which is an electrical signal, to air vibrations, and outputs these air vibrations so as to be audible to the human ear as a sound wave.

Thus, according to a communication apparatus of this embodiment, effects such as shown in above-described Embodiments 1 through 14 can be obtained in radio communications, and it is possible to decode an acoustic signal coded efficiently with a small number of bits, enabling a good acoustic signal to be output.

The present invention can be applied to a receiving apparatus, receive decoding apparatus, or speech signal decoding apparatus that uses audio signals. The present invention can also be applied to a mobile station apparatus or base station apparatus.

The present invention is not limited to the above-described embodiments, and various variations and modifications may be possible without departing from the scope of the present invention. For example, in the above embodiments a case has been described in which the present invention is implemented as a signal processing apparatus, but the present invention is not limited to this, and this signal processing method can also be implemented as software.

For example, it is also possible for a program that executes the above-described signal processing method to be stored in ROM (Read Only Memory) beforehand, and for this program to be operated by a CPU (Central Processing Unit).

It is also possible for a program that executes the above-described signal processing method to be stored in a computer-readable storage medium, for the program stored in the storage medium to be recorded in RAM (Random Access Memory) of a computer, and for the computer to be operated in accordance with that program.

In the above description, a case has been described in which MDCT is used as a method of transformation from the time domain to the frequency domain, but the present invention is not limited to this, and any transformation method can be applied as long as it is an orthogonal transformation method. For example, a discrete Fourier transform, discrete cosine transform or wavelet transform method can also be applied.

As is clear from the above description, according to a coding apparatus, decoding apparatus, coding method, and decoding method of the present invention, by performing enhancement layer coding using information obtained from base layer coding information, it is possible to perform high-quality coding at a low bit rate even in the case of a signal in which speech is predominant and music or environmental sound is superimposed in the background.
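The layered structure summarized above can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the coder of the embodiments: uniform scalar quantizers stand in for the CELP base layer and the MDCT-based enhancement layer, down-sampling is a pairwise average and up-sampling is sample repetition (a fixed 2:1 ratio), the delayer is omitted, and the names BASE_STEP, enh_step, encode and decode are hypothetical. The only point carried over from the description is that the enhancement layer uses a quantity derived from the base layer decoding result (here, decoded signal power), so the decoder can derive the same quantity without extra side information.

```python
import numpy as np

BASE_STEP = 0.25  # assumed step size of the stand-in base layer quantizer

def down_sample(x):
    # FH -> FL = FH / 2 by pairwise averaging (crude stand-in for a decimator).
    return x.reshape(-1, 2).mean(axis=1)

def up_sample(x):
    # FL -> FH by sample repetition (crude stand-in for an interpolator).
    return np.repeat(x, 2)

def quantize(x, step):
    return np.round(x / step).astype(int)

def dequantize(q, step):
    return q * step

def enh_step(base_decoded):
    # Enhancement step derived from base layer decoded power (assumed rule);
    # coder and decoder both compute it from the same decoded signal.
    return 0.05 * np.sqrt(np.mean(base_decoded ** 2)) + 1e-3

def encode(x):
    base_code = quantize(down_sample(x), BASE_STEP)        # base layer coder
    local = up_sample(dequantize(base_code, BASE_STEP))    # local decoder + up-sampler
    residual = x - local                                   # subtracter
    enh_code = quantize(residual, enh_step(local))         # enhancement layer coder
    return base_code, enh_code                             # "multiplexed" as a tuple

def decode(base_code, enh_code=None):
    base = up_sample(dequantize(base_code, BASE_STEP))     # base layer decoder + up-sampler
    if enh_code is None:
        return base                                        # decoding from part of the code
    return base + dequantize(enh_code, enh_step(base))     # adder

def demo(seed=0):
    rng = np.random.default_rng(seed)
    t = np.arange(256)
    x = np.sin(2 * np.pi * t / 32) + 0.1 * rng.standard_normal(t.size)
    base_code, enh_code = encode(x)
    rms = lambda e: float(np.sqrt(np.mean(e ** 2)))
    return rms(x - decode(base_code)), rms(x - decode(base_code, enh_code))
```

Calling demo() returns the RMS error when only the base layer is decoded and when both layers are decoded; in this toy setting the second value is the smaller one, which mirrors the scalable behavior described in the text.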

This application is based on Japanese Patent Application No. 2002-127541 filed on Apr. 26, 2002, and Japanese Patent Application No. 2002-267436 filed on Sep. 12, 2002, the entire content of which is expressly incorporated by reference herein.

Industrial Applicability

The present invention is suitable for use in apparatuses that code and decode speech signals, and communication apparatuses.

[FIG. 1]
ACOUSTIC DATA (INPUT SIGNAL)
-   101 DOWN-SAMPLER
-   102 BASE LAYER CODER
-   103 LOCAL DECODER
-   104 UP-SAMPLER
-   105 DELAYER
-   107 ENHANCEMENT LAYER CODER
-   108 MULTIPLEXER
CODED DATA (CODED SIGNAL)

[FIG. 2]
AMOUNT OF INFORMATION
BACKGROUND MUSIC AND BACKGROUND NOISE INFORMATION
VOICE INFORMATION
FREQUENCY

[FIG. 3]
AMOUNT OF INFORMATION
ENHANCEMENT LAYER
BASE LAYER
FREQUENCY

[FIG. 4]
FROM DOWN-SAMPLER 101
-   401 LPC ANALYZER
-   402 WEIGHTING SECTION
-   403 ADAPTIVE CODE BOOK SEARCH UNIT
-   404 ADAPTIVE GAIN QUANTIZER
-   405 TARGET VECTOR GENERATOR
-   406 NOISE CODE BOOK SEARCH UNIT
-   407 NOISE GAIN QUANTIZER
-   408 MULTIPLEXER
TO LOCAL DECODER 103 AND MULTIPLEXER 108

[FIG. 5]
FROM SUBTRACTER 106
-   501 LPC ANALYZER
-   502 SPECTRAL ENVELOPE CALCULATOR
-   503 MDCT SECTION
-   504 POWER CALCULATOR
-   505 POWER NORMALIZER
-   506 SPECTRUM NORMALIZER
-   507 BARK SCALE SHAPE CALCULATOR
-   508 BARK SCALE NORMALIZER
-   509 VECTOR QUANTIZER
-   510 MULTIPLEXER
TO MULTIPLEXER 108

[FIG. 6]
FROM SUBTRACTER 106
-   503 MDCT SECTION
-   504 POWER CALCULATOR
-   505 POWER NORMALIZER
-   506 SPECTRUM NORMALIZER
-   507 BARK SCALE SHAPE CALCULATOR
-   508 BARK SCALE NORMALIZER
-   509 VECTOR QUANTIZER
-   510 MULTIPLEXER
TO MULTIPLEXER 108
FROM LOCAL DECODER 103
-   601 CONVERSION TABLE
-   602 LPC COEFFICIENT MAPPING SECTION
-   603 SPECTRAL ENVELOPE CALCULATOR
-   604 TRANSFORMATION SECTION

[FIG. 7]
BASE LAYER LPC COEFFICIENTS
APPROXIMATION DETERMINATION
MAPPING CODE BOOK
ENHANCEMENT LAYER LPC COEFFICIENT CANDIDATES OUTPUT

[FIG. 8]
FROM SUBTRACTER 106
-   501 LPC ANALYZER
-   502 SPECTRAL ENVELOPE CALCULATOR
-   503 MDCT SECTION
-   504 POWER CALCULATOR
-   505 POWER NORMALIZER
-   506 SPECTRUM NORMALIZER
-   507 BARK SCALE SHAPE CALCULATOR
-   508 BARK SCALE NORMALIZER
-   509 VECTOR QUANTIZER
-   510 MULTIPLEXER
TO MULTIPLEXER 108
FROM LOCAL DECODER 103
-   801 SPECTRAL FINE STRUCTURE CALCULATOR

[FIG. 9]
FROM SUBTRACTER 106
-   501 LPC ANALYZER
-   502 SPECTRAL ENVELOPE CALCULATOR
-   503 MDCT SECTION
-   505 POWER NORMALIZER
-   506 SPECTRUM NORMALIZER
-   507 BARK SCALE SHAPE CALCULATOR
-   508 BARK SCALE NORMALIZER
-   509 VECTOR QUANTIZER
-   510 MULTIPLEXER
TO MULTIPLEXER 108
FROM LOCAL DECODER 103
-   901 POWER ESTIMATION UNIT
-   902 POWER FLUCTUATION AMOUNT QUANTIZER

[FIG. 10]
CODED DATA (CODED SIGNAL)
-   1001 DEMULTIPLEXER
-   1002 BASE LAYER DECODER
-   1003 UP-SAMPLER
-   1004 ENHANCEMENT LAYER DECODER
-   1005 DECODING RESULT

[FIG. 11]
FROM DEMULTIPLEXER 1001
-   1101 DEMULTIPLEXER
-   1102 EXCITATION GENERATOR
-   1103 SYNTHESIS FILTER
TO UP-SAMPLER 1003 AND ENHANCEMENT LAYER DECODER 1004

[FIG. 12]
FROM DEMULTIPLEXER 1001
-   1201 DEMULTIPLEXER
-   1202 LPC COEFFICIENT DECODER
-   1203 SPECTRAL ENVELOPE CALCULATOR
-   1204 VECTOR DECODER
-   1205 BARK SCALE SHAPE DECODER
-   1208 POWER DECODER
-   1210 IMDCT SECTION
TO ADDER 1005

[FIG. 13]
FROM DEMULTIPLEXER 1001
-   1201 DEMULTIPLEXER
-   1204 VECTOR DECODER
-   1205 BARK SCALE SHAPE DECODER
-   1208 POWER DECODER
-   1210 IMDCT SECTION
TO ADDER 1005
FROM BASE LAYER DECODER 1002
-   1301 CONVERSION TABLE
-   1302 LPC COEFFICIENT MAPPING SECTION
-   1303 SPECTRAL ENVELOPE CALCULATOR
-   1304 TRANSFORMATION SECTION

[FIG. 14]
FROM DEMULTIPLEXER 1001
-   1201 DEMULTIPLEXER
-   1202 LPC COEFFICIENT DECODER
-   1203 SPECTRAL ENVELOPE CALCULATOR
-   1204 VECTOR DECODER
-   1205 BARK SCALE SHAPE DECODER
-   1208 POWER DECODER
-   1210 IMDCT SECTION
TO ADDER 1005
FROM BASE LAYER DECODER 1002
-   1401 SPECTRAL FINE STRUCTURE CALCULATOR

[FIG. 15]
FROM DEMULTIPLEXER 1001
-   1201 DEMULTIPLEXER
-   1202 LPC COEFFICIENT DECODER
-   1203 SPECTRAL ENVELOPE CALCULATOR
-   1204 VECTOR DECODER
-   1205 BARK SCALE SHAPE DECODER
-   1210 IMDCT SECTION
TO ADDER 1005
FROM BASE LAYER DECODER 1002
-   1501 POWER ESTIMATION UNIT
-   1502 POWER FLUCTUATION AMOUNT DECODER
-   1503 POWER GENERATOR

[FIG. 16]
INPUT SIGNAL
-   1601 DOWN-SAMPLER
-   1602 BASE LAYER CODER
-   1603 LOCAL DECODER
-   1604 UP-SAMPLER
-   1605 DELAYER
-   1607 FREQUENCY DETERMINATION SECTION
-   1608 ENHANCEMENT LAYER CODER
-   1609 MULTIPLEXER

[FIG. 17]
AMOUNT OF INFORMATION
BACKGROUND MUSIC AND BACKGROUND NOISE INFORMATION
VOICE INFORMATION
FREQUENCY

[FIG. 18]
AMOUNT OF INFORMATION
ENHANCEMENT LAYER
BASE LAYER
FREQUENCY

[FIG. 19]
AMPLITUDE
MASKING M(m)
RESIDUAL ERROR E(m)
FREQUENCY
REGIONS REQUIRING QUANTIZATION
REGIONS NOT REQUIRING QUANTIZATION

[FIG. 20]
FROM UP-SAMPLER 1604
-   1901 FFT SECTION
-   1902 ESTIMATED AUDITORY MASKING CALCULATOR
-   1903 DETERMINATION SECTION
TO ENHANCEMENT LAYER CODER 1608

[FIG. 21]
FROM FFT SECTION 1901
-   2001 BARK SPECTRUM CALCULATOR
-   2002 SPREAD FUNCTION CONVOLUTION UNIT
-   2003 TONALITY CALCULATOR
-   2004 AUDITORY MASKING CALCULATOR
TO DETERMINATION SECTION 1903

[FIG. 22]
FROM SUBTRACTER 1606
-   2101 MDCT SECTION
-   2102 MDCT COEFFICIENT QUANTIZER
TO MULTIPLEXER 1609
FROM FREQUENCY DETERMINATION SECTION 1607

[FIG. 23]
FROM UP-SAMPLER 1604
-   2201 MDCT SECTION
-   1902 ESTIMATED AUDITORY MASKING CALCULATOR
-   1903 DETERMINATION SECTION
TO ENHANCEMENT LAYER CODER 1608

[FIG. 24]
CODED DATA
-   2301 DEMULTIPLEXER
-   2302 BASE LAYER DECODER
-   2303 UP-SAMPLER
-   2304 FREQUENCY DETERMINATION SECTION
-   2305 ENHANCEMENT LAYER DECODER

[FIG. 25]
FROM FREQUENCY DETERMINATION SECTION 2304
FROM DEMULTIPLEXER 2301
-   2401 MDCT COEFFICIENT DECODER
-   2402 IMDCT SECTION
-   2403 SUPERIMPOSITION ADDER
TO ADDER 2306

[FIG. 26]
FROM DOWN-SAMPLER 1601
-   2501 LPC ANALYZER
-   2502 WEIGHTING SECTION
-   2503 ADAPTIVE CODE BOOK SEARCH UNIT
-   2504 ADAPTIVE GAIN QUANTIZER
-   2505 TARGET VECTOR GENERATOR
-   2506 NOISE CODE BOOK SEARCH UNIT
-   2507 NOISE GAIN QUANTIZER
-   2508 MULTIPLEXER
TO LOCAL DECODER 1603 AND MULTIPLEXER 1609

[FIG. 27]
FROM DEMULTIPLEXER 2301
-   2601 DEMULTIPLEXER
-   2602 EXCITATION GENERATOR
-   2603 SYNTHESIS FILTER
TO UP-SAMPLER 2303

[FIG. 28]
FROM DEMULTIPLEXER 2301
-   2601 DEMULTIPLEXER
-   2602 EXCITATION GENERATOR
-   2603 COMBINING FILTER
-   2701 POST-FILTER
TO UP-SAMPLER 2303

[FIG. 29]
FROM UP-SAMPLER 1604
-   1901 FFT SECTION
-   1902 ESTIMATED AUDITORY MASKING CALCULATOR
-   2801 ESTIMATED ERROR SPECTRUM CALCULATOR
-   2802 DETERMINATION SECTION
TO ENHANCEMENT LAYER CODER 1608

[FIG. 30]
AMPLITUDE
FREQUENCY
-   P(m): BASE LAYER DECODED SIGNAL SPECTRUM
-   E(m): ERROR SPECTRUM
-   E′(m): ESTIMATED ERROR SPECTRUM

[FIG. 31]
FROM UP-SAMPLER 1604
-   1901 FFT SECTION
-   1902 ESTIMATED AUDITORY MASKING CALCULATOR
-   3001 ESTIMATED AUDITORY MASKING CORRECTION SECTION
FROM LOCAL DECODER 1603
-   3002 DETERMINATION SECTION
TO ENHANCEMENT LAYER CODER 1608

[FIG. 32]
FROM UP-SAMPLER 1604
-   1901 FFT SECTION
-   1902 ESTIMATED AUDITORY MASKING CALCULATOR
-   2801 ESTIMATED ERROR SPECTRUM CALCULATOR
-   3001 ESTIMATED AUDITORY MASKING CORRECTION SECTION
FROM LOCAL DECODER 1603
-   3101 DETERMINATION SECTION
TO ENHANCEMENT LAYER CODER 1608

[FIG. 33]
FROM SUBTRACTER 1606
-   2101 MDCT SECTION
FROM FREQUENCY DETERMINATION SECTION 1607
-   3201 ORDERING SECTION
-   3202 MDCT COEFFICIENT QUANTIZER
TO MULTIPLEXER 1609

[FIG. 34]
FREQUENCY (m)
ESTIMATED DISTORTION VALUE D(m)
ORDER

[FIG. 35]
FROM FREQUENCY DETERMINATION SECTION 2304
-   3401 ORDERING SECTION
FROM DEMULTIPLEXER 2301
-   3402 MDCT COEFFICIENT DECODER
-   2402 IMDCT SECTION
-   2403 SUPERIMPOSITION ADDER
TO ADDER 2306

[FIG. 36]
FROM SUBTRACTER 1606
-   2101 MDCT SECTION
FROM FREQUENCY DETERMINATION SECTION 1607
-   3502 MDCT COEFFICIENT QUANTIZER
TO MULTIPLEXER 1609
-   3501 FIXED BAND SPECIFICATION SECTION

[FIG. 37]
FROM FREQUENCY DETERMINATION SECTION 2304
FROM DEMULTIPLEXER 2301
-   3601 FIXED BAND SPECIFICATION SECTION
-   3602 MDCT COEFFICIENT DECODER
-   2402 IMDCT SECTION
-   2403 SUPERIMPOSITION ADDER
TO ADDER 2306

[FIG. 38]
FROM SUBTRACTER 1606
-   2101 MDCT SECTION
FROM FREQUENCY DETERMINATION SECTION 1607
-   3201 ORDERING SECTION
-   3701 MDCT COEFFICIENT QUANTIZER
TO MULTIPLEXER 1609
-   3501 FIXED BAND SPECIFICATION SECTION

[FIG. 39]
FROM FREQUENCY DETERMINATION SECTION 2304
-   3401 ORDERING SECTION
FROM DEMULTIPLEXER 2301
-   3601 FIXED BAND SPECIFICATION SECTION
-   3801 MDCT COEFFICIENT DECODER
-   2402 IMDCT SECTION
-   2403 SUPERIMPOSITION ADDER
TO ADDER 2306

[FIG. 40]
-   3901 INPUT APPARATUS
-   3902 A/D CONVERSION APPARATUS
-   3903 SIGNAL PROCESSING APPARATUS

[FIG. 41]
-   4002 RECEIVING APPARATUS
-   4003 SIGNAL PROCESSING APPARATUS
-   4004 D/A CONVERSION APPARATUS
-   4005 OUTPUT APPARATUS

[FIG. 42]
-   4101 INPUT APPARATUS
-   4102 A/D CONVERSION APPARATUS
-   4103 SIGNAL PROCESSING APPARATUS
-   4104 RF MODULATION APPARATUS

[FIG. 43]
-   4202 RF DEMODULATION APPARATUS
-   4203 SIGNAL PROCESSING APPARATUS
-   4204 D/A CONVERSION APPARATUS
-   4205 OUTPUT APPARATUS

1. A coding apparatus comprising: a down-sampling section that lowers a sampling rate of an input signal; a base layer coding section that encodes an input signal of which sampling rate is lowered and obtains first coding information; a decoding section that generates a decoded signal based on said first coding information; an up-sampling section that raises a sampling rate of said decoded signal to a rate identical to that of said input signal; an enhancement layer coding section that uses a parameter generated in decoding processing of said decoding section, encodes a difference value between said input signal and said decoded signal of which sampling rate is raised, and obtains second coding information; and a multiplexing section that multiplexes said first coding information and said second coding information.
2. The coding apparatus according to claim 1, wherein said base layer coding section encodes an input signal using code excited linear prediction.
3. The coding apparatus according to claim 1, wherein said enhancement layer coding section encodes an input signal using orthogonal transformation.
4. The coding apparatus according to claim 3, wherein said enhancement layer coding section encodes an input signal using MDCT processing.
5. The coding apparatus according to claim 1 through claim 4, wherein said enhancement layer coding section performs coding processing using the base layer LPC coefficients generated in decoding processing of said decoding section.
6. The coding apparatus according to claim 5, wherein said enhancement layer coding section converts the base layer LPC coefficients to the enhancement layer LPC coefficients based on a preset conversion table, calculates a spectral envelope based on the enhancement layer LPC coefficients, and uses said spectral envelope in at least one of spectrum normalization or vector quantization in coding processing.
7. The coding apparatus according to claim 1, wherein said enhancement layer coding section performs coding processing using a pitch period and pitch gain generated in decoding processing of said decoding section.
8. The coding apparatus according to claim 7, wherein said enhancement layer coding section calculates a spectral fine structure using a pitch period and pitch gain, and uses said spectral fine structure in spectrum normalization and vector quantization in coding processing.
9. The coding apparatus according to claim 1, wherein said enhancement layer coding section performs coding processing using power of a decoded signal generated by said decoding section.
10. The coding apparatus according to claim 9, wherein said enhancement layer coding section quantizes a fluctuation amount of power of MDCT coefficients based on power of a decoded signal, and uses said quantized MDCT coefficient power fluctuation amount in power normalization in coding processing.
11. The sound coding apparatus according to claim 1, further comprising: a subtraction section that obtains an error signal from a difference between an input signal at the time of input and a decoded signal of which sampling rate is raised; and a frequency determination section that determines the frequencies subject to coding of said error signal based on a decoded signal of which sampling rate is raised; wherein said enhancement layer coding section encodes said error signal at said frequencies.
12. The sound coding apparatus according to claim 11, further comprising an auditory masking section that calculates auditory masking that indicates an amplitude value that does not contribute to hearing; wherein said enhancement layer coding section determines an object of coding so that a signal within said auditory masking is not made an object of coding in said frequency determination section and encodes an error spectrum that is a spectrum of said error signal.
13. The sound coding apparatus according to claim 12, wherein: said auditory masking section comprises: a frequency domain transformation section that transforms a decoded signal of which sampling rate is raised to frequency domain coefficients; an estimated auditory masking calculation section that calculates estimated auditory masking using said frequency domain coefficients; and a determination section that finds a frequency at which an amplitude value of a spectrum of said decoded signal exceeds an amplitude value of said estimated auditory masking; and said enhancement layer coding section encodes said error spectrum located at said frequency.
14. The sound coding apparatus according to claim 13, wherein: said auditory masking section comprises an estimated error spectrum calculation section that calculates an estimated error spectrum using said frequency domain coefficients; and said determination section finds the frequencies at which an amplitude value of said estimated error spectrum exceeds an amplitude value of said estimated auditory masking.
15. The sound coding apparatus according to claim 13, wherein: said auditory masking section comprises a correction section that smoothes estimated auditory masking calculated by said estimated auditory masking calculation section; and said determination section finds the frequencies at which an amplitude value of said decoded signal spectrum or said estimated error spectrum exceeds an amplitude value of smoothed said estimated auditory masking.
16. The sound coding apparatus according to claim 13, wherein said enhancement layer coding section calculates for each frequency an amplitude value difference between either an estimated error spectrum or error spectrum and either auditory masking or estimated auditory masking, and determines an amount of coding information based on the amount of said amplitude value difference.
17. The sound coding apparatus according to claim 13, wherein said enhancement layer coding section encodes said error spectrum in a predetermined band in addition to the frequencies found by said determination section.
18. A decoding apparatus comprising: a base layer decoding section that decodes first coding information in which an input signal is coded in predetermined base frame units by a coding side and obtains a first decoded signal; an enhancement layer decoding section that decodes second coding information and obtains a second decoded signal; an up-sampling section that raises a sampling rate of said first decoded signal to a rate identical to that of said second decoded signal; and an addition section that adds said first decoded signal of which sampling rate is raised and said second decoded signal.
19. The decoding apparatus according to claim 18, wherein said base layer decoding section decodes first coding information generated by code excited linear prediction.
20. The decoding apparatus according to claim 18, wherein said enhancement layer decoding section decodes second coding information using orthogonal transformation.
21. The decoding apparatus according to claim 20, wherein said enhancement layer decoding section decodes second coding information using inverse MDCT processing.
22. The decoding apparatus according to claim 18, wherein said enhancement layer decoding section decodes second coding information using the base layer LPC coefficients.
23. The decoding apparatus according to claim 22, wherein said enhancement layer decoding section converts the base layer LPC coefficients to the enhancement layer LPC coefficients based on a preset conversion table, calculates a spectral envelope based on the enhancement layer LPC coefficients, and uses said spectral envelope in vector decoding in decoding processing.
24. The decoding apparatus according to claim 18, wherein said enhancement layer decoding section performs decoding processing using at least one of pitch period or pitch gain.
25. The decoding apparatus according to claim 24, wherein said enhancement layer decoding section calculates a spectral fine structure using a pitch period and pitch gain, and uses said spectral fine structure in vector decoding in decoding processing.
26. The decoding apparatus according to claim 24, wherein said enhancement layer decoding section performs decoding processing using power of a decoded signal generated by said decoding section.
27. The decoding apparatus according to claim 26, wherein said enhancement layer decoding section decodes a fluctuation amount of power of MDCT coefficients based on power of a decoded signal, and uses said decoded MDCT coefficients power fluctuation amount in power normalization in decoding processing.
28. The sound decoding apparatus according to claim 18, further comprising a frequency determination section that determines the frequencies subject to decoding of second coding information in which a residual error signal of an input signal and a signal resulting from decoding of first coding information is coded by a coding side based on said up-sampled first decoded signal; wherein: said enhancement layer decoding section decodes said second coding information using said frequency information and generates a second decoded signal; and said addition section adds said second decoded signal and a first decoded signal of which sampling rate is raised.
29. The sound decoding apparatus according to claim 28, further comprising an auditory masking section that calculates auditory masking that indicates an amplitude value that does not contribute to hearing; wherein said enhancement layer decoding section determines an object of decoding so that a signal within said auditory masking is not made an object of decoding in said frequency determination section.
30. The sound decoding apparatus according to claim 29, wherein: said auditory masking section comprises: a frequency domain transformation section that transforms a base layer decoded signal of which sampling rate is raised to frequency domain coefficients; an estimated auditory masking calculation section that calculates estimated auditory masking using said frequency domain coefficients; and a determination section that finds the frequencies at which an amplitude value of a spectrum of said decoded signal exceeds an amplitude value of said estimated auditory masking; and said enhancement layer decoding section decodes said error spectrum located at said frequencies.
31. The sound decoding apparatus according to claim 30, wherein: said auditory masking section comprises an estimated error spectrum calculation section that calculates an estimated error spectrum using said frequency domain coefficients; and said determination section finds the frequencies at which an amplitude value of said estimated error spectrum exceeds an amplitude value of said estimated auditory masking.
32. The sound decoding apparatus according to claim 30, wherein: said auditory masking section comprises a correction section that smoothes estimated auditory masking calculated by said estimated auditory masking calculation section; and said determination section finds the frequencies at which an amplitude value of said decoded signal spectrum or said estimated error spectrum exceeds an amplitude value of smoothed said estimated auditory masking.
33. The sound decoding apparatus according to claim 29, wherein said enhancement layer decoding section calculates for each frequency an amplitude value difference between either an estimated error spectrum or error spectrum and either auditory masking or estimated auditory masking, and determines an amount of decoding information based on the amount of said amplitude value difference.
34. The sound decoding apparatus according to claim 29, wherein said enhancement layer decoding section decodes said error spectrum in a predetermined band in addition to the frequencies found by said determination section.
35. An acoustic signal transmitting apparatus comprising: an acoustic input section that converts an acoustic signal to an electrical signal; an A/D conversion section that converts a signal output from said acoustic input section to a digital signal; the coding apparatus according to claim 1 that encodes a digital signal output from said A/D conversion section; an RF modulation section that modulates coding information output from said coding apparatus to a radio frequency signal; and a transmitting antenna that converts a signal output from said RF modulation section to a radio wave, and transmits that radio wave.
36. An acoustic signal receiving apparatus comprising: a receiving antenna that receives a radio wave; an RF demodulation section that demodulates a signal received by said receiving antenna; the decoding apparatus according to claim 18 that decodes information obtained by said RF demodulation section; a D/A conversion section that converts a signal output from said decoding apparatus to an analog signal; and an acoustic output section that converts an electrical signal output from said D/A conversion section to an acoustic signal.
37. A communication terminal apparatus comprising the acoustic signal transmitting apparatus according to claim 35.
38. A communication terminal apparatus comprising the acoustic signal receiving apparatus according to claim 36.
39. A base station apparatus comprising the acoustic signal transmitting apparatus according to claim 35.
40. A base station apparatus comprising the acoustic signal receiving apparatus according to claim 36.
41. A coding method comprising: a step of lowering a sampling rate of an input signal; a step of coding an input signal of which sampling rate is lowered and obtaining first coding information; a step of generating a decoded signal based on said first coding information; a step of raising a sampling rate of said decoded signal to a rate identical to that of said input signal; a step of using a parameter obtained in processing that generates said decoded signal, coding a difference value between said input signal and said decoded signal of which sampling rate is raised, and obtaining second coding information; and a step of multiplexing said first coding information and said second coding information.
42. A decoding method comprising: a step of decoding first coding information and obtaining a first decoded signal; a step of decoding second coding information and obtaining a second decoded signal; a step of raising a sampling rate of said first decoded signal to a rate identical to that of said second decoded signal; and a step of adding said first signal of which sampling rate is raised and said second signal.